Systems, apparatus, and methods for robotic learning and execution of skills

ABSTRACT

Systems, apparatus, and methods are described for robotic learning and execution of skills. A robotic apparatus can include a memory, a processor, sensors, and one or more movable components (e.g., a manipulating element and/or a transport element). The processor can be operatively coupled to the memory, the movable component(s), and the sensors, and configured to obtain information about an environment, including one or more objects located within the environment. In some embodiments, the processor can be configured to learn skills through demonstration, exploration, user inputs, etc. In some embodiments, the processor can be configured to execute skills and/or arbitrate between different behaviors and/or actions. In some embodiments, the processor can be configured to learn an environmental constraint. In some embodiments, the processor can be configured to learn using a general model of a skill.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 17/504,130, entitled “SYSTEMS, APPARATUS, AND METHODS FOR ROBOTIC LEARNING AND EXECUTION OF SKILLS,” filed Oct. 18, 2021, which is a continuation of U.S. patent application Ser. No. 16/456,919, entitled “SYSTEMS, APPARATUS, AND METHODS FOR ROBOTIC LEARNING AND EXECUTION OF SKILLS,” filed Jun. 28, 2019, now U.S. Pat. No. 11,148,288, issued Oct. 19, 2021, which claims priority to U.S. Provisional Application No. 62/723,694, entitled “SYSTEMS, APPARATUS, AND METHODS FOR ROBOTIC LEARNING AND EXECUTION OF SKILLS,” filed Aug. 28, 2018. U.S. patent application Ser. No. 16/456,919 also is a continuation-in-part of International PCT Application No. PCT/US2018/019520, entitled “SYSTEMS, APPARATUS, AND METHODS FOR ROBOTIC LEARNING AND EXECUTION OF SKILLS,” filed Feb. 23, 2018, which claims priority to U.S. Provisional Application No. 62/463,628, entitled “METHOD AND SYSTEM FOR ROBOTIC LEARNING OF EXECUTION PROCESSES RELATED TO PERCEPTUALLY CONSTRAINED MANIPULATION SKILLS,” filed Feb. 25, 2017, and U.S. Provisional Application No. 62/463,630, entitled “METHOD AND SYSTEM FOR ROBOTIC EXECUTION OF A PERCEPTUALLY CONSTRAINED MANIPULATION SKILL LEARNED VIA HUMAN INTERACTION,” filed Feb. 25, 2017.

The disclosures of each of the above-referenced applications are incorporated by reference herein in their entirety.

GOVERNMENT SUPPORT

This invention was made with U.S. government support under Grant Nos. 1621651 and 1738375 awarded by the National Science Foundation under the Small Business Innovation Research Program Phase I. The U.S. government has certain rights in the invention.

TECHNICAL FIELD

The present disclosure relates generally to systems, apparatus, and methods for robotic learning and execution of skills. More specifically, the present disclosure relates to a robotic apparatus capable of learning and executing skills in unstructured environments.

BACKGROUND

Robots can be used to perform and automate a variety of tasks. Robots can perform tasks by moving through an environment, such as an office building or a hospital. Robots can be equipped with wheels, tracks, or other mobile components that enable them to move autonomously around an environment. Robots that do not have an arm or other manipulator, however, cannot manipulate objects in the environment. Therefore, these robots are limited in their ability to perform tasks, e.g., such robots may not be able to pick up and deliver an object without a human being present to manipulate the object at a pick up or delivery point.

Robots that include an arm or other manipulator may be able to pick up and deliver objects to a location without a human being present. For example, a robot that has an arm with an end effector, such as a gripper, can use the gripper to pick up one or more objects from different locations and deliver those objects to a new location, all without human assistance. These robots can be used to automate certain tasks, thereby allowing human operators to focus on other tasks. However, most commercial robots do not include manipulators due to the challenge and complexity of programming the operation of a manipulator.

Moreover, most commercial robots are designed to operate in structured environments, e.g., a factory, a warehouse, etc. Unstructured environments, e.g., environments involving humans, such as hospitals and homes, can impose additional challenges for programming a robot. In unstructured environments, robots cannot rely on complete knowledge of their surrounding environment but must be able to perceive changes in their surrounding environment and adapt based on those changes. Thus, in unstructured environments, robots have to continuously acquire information about the environment to be able to make autonomous decisions and perform tasks. Oftentimes, the movements of a robot, e.g., the movements of an arm or end effector in the environment, are also constrained by objects and other obstacles in the environment, further adding to the challenge of robot perception and manipulation. Given the uncertain and dynamic nature of unstructured environments, robots typically cannot be pre-programmed to perform tasks.

Accordingly, there is a need for robotic systems that can perceive and adapt to dynamic and unstructured environments, and can perform tasks within those environments, without relying on pre-programmed manipulation skills.

SUMMARY

Systems, apparatus, and methods are described for robotic learning and execution of skills. In some embodiments, an apparatus includes a memory, a processor, a manipulating element, and a set of sensors. The processor can be operatively coupled to the memory, the manipulating element, and the set of sensors, and configured to: obtain, via a subset of sensors from the set of sensors, a representation of an environment; identify a plurality of markers in the representation of the environment, each marker from the plurality of markers associated with a physical object from a plurality of physical objects located in the environment; present information indicating a position of each marker from the plurality of markers in the representation of the environment; receive a selection of a set of markers from the plurality of markers associated with a set of physical objects from the plurality of physical objects; obtain, for each position from a plurality of positions associated with a motion of the manipulating element in the environment, sensory information associated with the manipulating element, where the motion of the manipulating element is associated with a physical interaction between the manipulating element and the set of physical objects; and generate, based on the sensory information, a model configured to define movements of the manipulating element to execute the physical interaction between the manipulating element and the set of physical objects.

In some embodiments, the manipulating element can include a plurality of joints and an end effector. In some embodiments, the set of physical objects can include a human.

In some embodiments, the plurality of markers are fiducial markers, and the representation of the environment is a visual representation of the environment.

In some embodiments, two or more markers from the plurality of markers can be associated with one physical object from the set of physical objects. Alternatively or additionally, in some embodiments, one marker from the plurality of markers can be associated with two or more physical objects from the set of physical objects.

In some embodiments, a method includes: obtaining, via a set of sensors, a representation of an environment; identifying a plurality of markers in the representation of the environment, each marker from the plurality of markers associated with a physical object from a plurality of physical objects located in the environment; presenting information indicating a position of each marker from the plurality of markers in the representation of the environment; receiving, after the presenting, a selection of a set of markers from the plurality of markers associated with a set of physical objects from the plurality of physical objects; obtaining, for each position from a plurality of positions associated with a motion of a manipulating element in the environment, sensory information associated with the manipulating element, where the motion of the manipulating element is associated with a physical interaction between the manipulating element and the set of physical objects; and generating, based on the sensory information, a model configured to define movements of the manipulating element to execute the physical interaction between the manipulating element and the set of physical objects.

In some embodiments, the method further includes receiving a selection of a first subset of features from the set of features, where the model is generated based on sensor data associated with the first subset of features and not based on sensor data associated with a second subset of features from the set of features not included in the first subset of features.
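By way of illustration only, the following Python sketch shows one way the learning flow described above could be organized: sensory information is recorded at a set of positions (e.g., keyframes) along a demonstrated motion and then passed to a model-generation routine. The function and parameter names (learn_from_demonstration, read_joint_config, build_model, etc.) are hypothetical and are not taken from the disclosure.

```python
def learn_from_demonstration(selected_markers, keyframe_positions,
                             read_joint_config, read_ee_pose, read_wrench,
                             build_model):
    """Record sensory information at each keyframe of a demonstrated motion
    and generate a model of the physical interaction."""
    recordings = []
    for position in keyframe_positions:
        recordings.append({
            "position": position,                 # position along the demonstrated motion
            "joint_config": read_joint_config(),  # manipulating element joint state
            "ee_pose": read_ee_pose(),            # end effector position/orientation
            "wrench": read_wrench(),              # forces/torques at the end effector
        })
    # the generated model defines movements that reproduce the demonstrated interaction
    return build_model(selected_markers, recordings)
```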

In some embodiments, a method includes: obtaining, via a set of sensors, a representation of an environment; identifying a plurality of markers in the representation of the environment, each marker from the plurality of markers associated with a physical object from a plurality of physical objects located in the environment; presenting information indicating a position of each marker from the plurality of markers in the representation of the environment; in response to receiving a selection of a set of markers from the plurality of markers associated with a set of physical objects from the plurality of physical objects, identifying a model associated with executing a physical interaction between a manipulating element and the set of physical objects, the manipulating element including a plurality of joints and an end effector; and generating, using the model, a trajectory for the manipulating element that defines movements of the plurality of joints and the end effector associated with executing the physical interaction.

In some embodiments, the method further includes: displaying to a user the trajectory for the manipulating element in the representation of the environment; receiving, after the displaying, an input from the user; and in response to the input indicating an acceptance of the trajectory for the manipulating element, implementing the movements of the plurality of joints and the end effector to execute the physical interaction.

In some embodiments, the model is associated with (i) a stored set of markers, (ii) sensory information indicating at least one of a position or an orientation of the manipulating element at points along a stored trajectory of the manipulating element associated with the stored set of markers, and (iii) sensory information indicating a configuration of the plurality of joints at the points along the stored trajectory. In such embodiments, generating the trajectory for the manipulating element can include: computing a transformation function between the set of markers and the stored set of markers; transforming, for each point, the at least one of the position or the orientation of the manipulating element using the transformation function; determining, for each point, a planned configuration of the plurality of joints based on the configuration of the plurality of joints at the points along the stored trajectory; and determining, for each point, a portion of the trajectory between that point and a consecutive point based on the planned configuration of the plurality of joints for that point.
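By way of illustration only, the following Python sketch shows one way the trajectory generation described above could be implemented, assuming the markers are represented by 3-D positions and the stored skill records an end-effector position and a joint configuration at each stored point (keyframe). The transformation function is computed here as a least-squares rigid transformation (Kabsch algorithm); the names Keyframe, rigid_transform, and adapt_trajectory are hypothetical, not the specific algorithm of the disclosure.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Keyframe:
    ee_position: np.ndarray      # stored end-effector position, shape (3,)
    joint_config: np.ndarray     # stored joint configuration, shape (n_joints,)

def rigid_transform(stored_markers: np.ndarray, observed_markers: np.ndarray):
    """Least-squares rotation R and translation t mapping stored marker
    positions (k, 3) onto observed marker positions (k, 3) (Kabsch algorithm)."""
    cs, co = stored_markers.mean(axis=0), observed_markers.mean(axis=0)
    H = (stored_markers - cs).T @ (observed_markers - co)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, co - R @ cs

def adapt_trajectory(keyframes, stored_markers, observed_markers):
    """Transform each stored end-effector position into the current scene and
    pair it with a planned joint configuration seeded from the stored one."""
    R, t = rigid_transform(stored_markers, observed_markers)
    planned = []
    for kf in keyframes:
        target = R @ kf.ee_position + t            # transformed waypoint
        planned.append((target, kf.joint_config))  # seed for inverse kinematics / planning
    # consecutive waypoints define the portions of the trajectory between points
    return planned
```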

In some embodiments, a robotic device is configured to learn and execute skills associated with a transport element. A transport element can be, for example, a set of wheels, tracks, crawlers, or other suitable device that enables movement of a robotic device within an environment, such as from a first room to a second room, or between floors of a building. The robotic device can be configured to learn skills associated with the transport element by obtaining sensory information associated with the transport element, an environment surrounding the robotic device, and/or another component of the robotic device, when the robotic device is moved from a first location to a second location, e.g., through a doorway or by an object such as a human. The sensory information can be recorded at specific points during the movement of the robotic device, e.g., at a keyframe. The robotic device can be configured to generate, based on the sensory information, a model configured to define movements of the transport elements (or other components of the robotic device, e.g., a manipulating element) to execute the movement of the robotic device from the first location to the second location.

In some embodiments, a robotic device is configured to learn a specialized skill using a generic version of the skill. The robotic device can initiate execution of the generic skill in a specific environment and pause the execution when the robotic device reaches a part of the execution that requires specialization of the skill in the specific environment. The robotic device can then prompt a user for a demonstration of that part of the skill. The robotic device can continue to execute the skill and/or prompt a user for demonstrations of particular parts of the skill until the skill is completed. The robotic device then adapts the generic skill based on the demonstrations of the specific parts of the skill to generate a specialized model for executing the skill.
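A minimal sketch of the pause-and-demonstrate loop described above, assuming the generic skill is represented as a sequence of steps, some of which are flagged as requiring a site-specific demonstration. The step representation and the callables execute_step and request_demonstration are assumptions for illustration only.

```python
def specialize_skill(generic_steps, execute_step, request_demonstration):
    """Run a generic skill, pausing to collect demonstrations where needed,
    and return a specialized step sequence for the specific environment."""
    specialized = []
    for step in generic_steps:
        if step.get("needs_demonstration"):
            demo = request_demonstration(step)   # e.g., kinesthetic teaching by a user
            specialized.append(demo)
        else:
            execute_step(step)                   # generic part executes as-is
            specialized.append(step)
    return specialized                           # basis for the specialized model
```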

In some embodiments, a robotic device is configured to learn environmental constraints. The robotic device can record information about its surrounding environment or objects within that environment, and obtain general knowledge about the environment. The robotic device can apply this general knowledge to a set of models that relate to different skills that are executed within the environment.

In some embodiments, a robotic device is capable of using behavior arbitration to act continuously and autonomously within a dynamic environment such as a human social environment. The robotic device may have various resources or components, e.g., a manipulation element, a transport element, a head, a camera, etc., and can apply arbitration algorithms to determine when and how to use those resources. The arbitration algorithms can define a set of rules that enables different actions or behaviors to be prioritized based on the information that is being captured at the robotic device. The robotic device can be configured to continuously decide between executing different actions and behaviors (and therefore use different resources or components), as new information is captured at the robotic device. As part of the behavior arbitration, the robotic device can be configured to handle interruptions to an action and switch to a new action. By continuously monitoring itself and its surrounding environment and performing continuous arbitration, the robotic device can engage in socially appropriate behavior continually over time.
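The following Python sketch illustrates one simple, rule-based form that such behavior arbitration could take, in which each behavior declares the resources it needs and a priority computed from the latest sensor information, and a running behavior is interrupted only by a higher-priority behavior that contends for the same resources. This is an illustrative assumption, not the specific arbitration algorithm of the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Behavior:
    name: str
    resources: Set[str]                    # e.g., {"arm", "base", "head"}
    priority: Callable[[dict], float]      # urgency given current sensor information

def arbitrate(behaviors, sensor_info, running=None):
    """Select the behavior to run given newly captured information.

    The running behavior is interrupted only when a behavior that contends
    for one of its resources has strictly higher priority."""
    best = max(behaviors, key=lambda b: b.priority(sensor_info))
    if running is None:
        return best
    if (best is not running
            and best.resources & running.resources
            and best.priority(sensor_info) > running.priority(sensor_info)):
        return best                        # interrupt and switch to the new action
    return running
```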

In some embodiments, a robotic device is configured to receive social context information associated with an environment and combine that information with a navigational map of the environment (e.g., layer the social context information on top of the navigational map). The robotic device can be provided with the social context information by a user (e.g., a human operator), or the robotic device can capture the social context information through interactions within the environment. The robotic device can refine the social context information as the robotic device engages in interactions with humans within an environment over time. In some embodiments, the robotic device can be configured to associate certain types of behaviors or actions with particular locations within the environment based on the social context information associated with those locations.

In some embodiments, a robotic device is configured to interact with a human operator, either on site or via a network connection to a remote device operated by the human operator. The human operator is referred to herein as a “robot supervisor,” and can use various software-based tools to inform the robotic device of information regarding its surrounding environment and/or instruct the robotic device to perform certain actions. These actions can include, for example, navigation behaviors, manipulation behaviors, head behaviors, sounds, lights, etc. The robot supervisor can control the robotic device to collect information during learning or execution of an action and/or to tag certain behavior as positive or negative behavior such that the robotic device can use that information to improve its ability to arbitrate among different actions or to execute a particular action in the future.

Other systems, processes, and features will become apparent to those skilled in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, processes, and features be included within this description, be within the scope of the present invention, and be protected by the accompanying claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The skilled artisan will understand that the drawings primarily are for illustrative purposes and are not intended to limit the scope of the inventive subject matter described herein. The drawings are not necessarily to scale; in some instances, various aspects of the inventive subject matter disclosed herein may be shown exaggerated or enlarged in the drawings to facilitate an understanding of different features. In the drawings, like reference characters generally refer to like features (e.g., functionally similar and/or structurally similar elements).

FIG. 1 is a block diagram illustrating a configuration of a system including a robotic device, according to some embodiments.

FIG. 2 is a block diagram illustrating a configuration of a robotic device, according to some embodiments.

FIG. 3 is a block diagram illustrating a configuration of a control unit associated with a robotic device, according to some embodiments.

FIG. 4 is a schematic illustration of a manipulating element of a robotic device, according to some embodiments.

FIG. 5 is a schematic illustration of a robotic device, according to some embodiments.

FIGS. 6A and 6B are schematic illustrations of objects in an environment viewed by a robotic device, according to some embodiments.

FIGS. 7A and 7B are schematic illustrations of objects in an environment viewed by a robotic device, according to some embodiments.

FIG. 8 is a flow diagram illustrating a method of scanning an environment performed by a robotic device, according to some embodiments.

FIG. 9 is a flow diagram illustrating a method of learning and execution of skills performed by a robotic device, according to some embodiments.

FIG. 10 is a flow diagram illustrating a method of learning a skill performed by a robotic device, according to some embodiments.

FIG. 11 is a flow diagram illustrating a method of executing a skill performed by a robotic device, according to some embodiments.

FIG. 12 is a block diagram showing a system architecture for robotic learning and execution, including user actions, according to some embodiments.

FIG. 13 is a flow diagram illustrating an operation of a robotic device within an environment, according to some embodiments.

FIG. 14 is a flow diagram illustrating a method of requesting and receiving input from a user, such as a robot supervisor, according to some embodiments.

FIG. 15 is a flow diagram illustrating a method of learning skills and environmental constraints performed by a robotic device, according to some embodiments.

FIG. 16 is a flow diagram illustrating a method of learning a skill from a generic skill model performed by a robotic device, according to some embodiments.

FIG. 17 is a flow diagram illustrating a method of learning an environmental constraint performed by a robotic device, according to some embodiments.

FIG. 18 is a block diagram illustrating an example of components of a robotic device performing behavior arbitration, according to some embodiments.

FIG. 19 is a schematic illustration of layers of a map of an environment generated by a robotic device, according to some embodiments.

FIG. 20 is a block diagram illustrating a configuration of a control unit associated with a robotic device, according to some embodiments.

FIG. 21 depicts the flow of information that a robotic device provides to and receives from a map maintained by the robotic device, according to some embodiments.

FIG. 22 depicts a flow diagram illustrating different learning modes of a robotic device, according to some embodiments.

FIGS. 23-25 depict flow diagrams illustrating example learning behaviors of a robotic device, according to some embodiments.

DETAILED DESCRIPTION

Systems, apparatus, and methods are described herein for robotic learning and execution of skills. In some embodiments, systems, apparatus, and methods described herein relate to a robotic apparatus capable of learning skills via human demonstrations and interactions and executing learned skills in unstructured environments.

Overview

In some embodiments, systems, apparatus, and methods described herein relate to robots that can learn skills (e.g., manipulation skills) via a Learning from Demonstration (“LfD”) process, in which a human demonstrates an action to the system via kinesthetic teaching (e.g., the human guides the robot through the action physically and/or by remote control) and/or the human's own performance of the action. Such systems, apparatus, and methods do not require a robot to be pre-programmed with manipulation skills, but rather the robot is designed to be adaptive and capable of learning skills via observations. For example, a robot can use machine-learning techniques to acquire and execute a manipulation skill. After learning a skill, the robot can execute the skill in different environments. The robot can learn and/or execute a skill based on visual data (e.g., perceived visual information). Alternatively or additionally, the robot can learn and/or execute the skill using haptic data (e.g., torque, forces, and other non-visual information). Robotic learning can take place at a factory prior to robot deployment, or onsite (e.g., at a hospital) after robot deployment. In some embodiments, a robot can be taught a skill and/or adapted to operate in an environment by users who are not trained in robotics and/or programming. For example, the robot can have a learning algorithm that leverages natural human behavior, and can include tools that can guide a user through the demonstration process.

In some embodiments, a robot can be designed to interact with humans and collaborate with humans to perform tasks. In some embodiments, a robot can use common social behavior to act in a socially predictable and acceptable manner around humans. Robots that are mobile can also be designed to navigate within an environment while interacting with humans in that environment. For example, a robot can be programmed to voice certain phrases to navigate around humans, to move aside to allow humans to pass, and to use eye gaze to communicate intentionality during navigation. In some embodiments, a robot can have sensors that enable it to perceive and track humans in the environment around it, and to use that information to trigger eye gaze and other social behaviors.

In some embodiments, a robot can be designed to propose options for achieving a goal or performing an action during an LfD process. For example, a robot can propose a few different options for achieving a goal (e.g., picking up an object), and can indicate which of those options is most likely to be effective and/or efficient at achieving the goal. In some embodiments, a robot can adapt a skill based on inputs by a user, e.g., a user indicating relevant features to include in a skill model.

In some embodiments, a robotic device can be capable of learning and/or executing skills in an unstructured environment, e.g., a dynamic and/or human environment, where the robotic device does not have complete information of the environment beforehand. Unstructured environments can include, for example, indoor and outdoor settings, and can include one or more humans or other objects that are movable within the environment. Since most natural or real-world environments are unstructured, robotic devices that can adapt and operate in an unstructured environment, such as robotic devices and/or systems described herein, can offer significant improvements over existing robotic devices that are incapable of adapting to an unstructured environment. Unstructured environments can include indoor settings (e.g., buildings, offices, houses, rooms, etc.) and/or other types of enclosed spaces (e.g., airplanes, trains, and/or other types of movable compartments), as well as outdoor settings (e.g., parks, beaches, outside yards, fields). In an embodiment, robotic devices described herein can operate in an unstructured hospital environment.

FIG. 1 is a high-level block diagram that illustrates a system 100, according to some embodiments. System 100 can be configured to learn and execute skills, such as, for example, manipulation skills in an unstructured environment. System 100 can be implemented as a single device, or be implemented across multiple devices that are connected to a network 105. For example, as depicted in FIG. 1, system 100 can include one or more compute devices, such as, for example, one or more robotic devices 102 and 110, a server 120, and additional compute device(s) 150. While four devices are shown, it should be understood that system 100 can include any number of compute devices, including compute devices not specifically shown in FIG. 1.

Network 105 can be any type of network (e.g., a local area network (LAN), a wide area network (WAN), a virtual network, a telecommunications network) implemented as a wired network and/or wireless network and used to operatively couple compute devices, including robotic devices 102 and 110, server 120, and compute device(s) 150. As described in further detail herein, in some embodiments, for example, the compute devices are computers connected to each other via an Internet Service Provider (ISP) and the Internet (e.g., network 105). In some embodiments, a connection can be defined, via network 105, between any two compute devices. As shown in FIG. 1, for example, a connection can be defined between robotic device 102 and any one of robotic device 110, server 120, or additional compute device(s) 150. In some embodiments, the compute devices can communicate with each other (e.g., send data to and/or receive data from) and with the network 105 via intermediate networks and/or alternate networks (not shown in FIG. 1). Such intermediate networks and/or alternate networks can be of a same type and/or a different type of network as network 105. Each compute device can be any type of device configured to send data over the network 105 to send and/or receive data from one or more of the other compute devices.

In some embodiments, system 100 includes a single robotic device, e.g., robotic device 102. Robotic device 102 can be configured to perceive information about an environment, learn skills via human demonstration and interactions, interact with the environment and/or learn environmental constraints via human demonstrations and inputs, and/or execute those skills in the environment. In some embodiments, robotic device 102 can engage in self-exploration and/or request user input to learn additional information with respect to a skill and/or environment. A more detailed view of an example robotic device is depicted in FIG. 2.

In other embodiments, system 100 includes multiple robotic devices, e.g., robotic devices 102 and 110. Robotic device 102 can send and/or receive data to and/or from robotic device 110 via network 105. For example, robotic device 102 can send information that it perceives about an environment (e.g., a location of an object) to robotic device 110, and can receive information about the environment from robotic device 110. Robotic devices 102 and 110 can also send and/or receive information to and/or from one another to learn and/or execute a skill. For example, robotic device 102 can learn a skill in an environment and send a model representing that learned skill to robotic device 110, and robotic device 110, upon receiving that model, can use it to execute the skill in the same or a different environment. Robotic device 102 can be in a location that is the same as or different from robotic device 110. For example, robotic devices 102 and 110 can be located in the same room of a building (e.g., a hospital building) such that they can learn and/or execute a skill together (e.g., moving a heavy or large object). Alternatively, robotic device 102 can be located on a first floor of a building (e.g., a hospital building), and robotic device 110 can be located on a second floor of a building, and the two can communicate with one another to relay information about the different floors to one another (e.g., where objects are located on those floors, where a resource may be, etc.).

In some embodiments, system 100 includes one or more robotic devices, e.g., robotic device 102 and/or 110, and a server 120. Server 120 can be a dedicated server that manages robotic device 102 and/or 110. Server 120 can be in a location that is the same as or different from robotic device 102 and/or 110. For example, server 120 can be located in the same building as one or more robotic devices (e.g., a hospital building), and be managed by a local administrator (e.g., a hospital administrator). Alternatively, server 120 can be located at a remote location (e.g., a location associated with a manufacturer or provider of the robotic device).

In some embodiments, system 100 includes one or more robotic devices, e.g., robotic device 102 and/or 110, and an additional compute device 150. Compute device 150 can be any suitable processing device configured to run and/or execute certain functions. In a hospital setting, for example, a compute device 150 can be a diagnostic and/or treatment device that is capable of connecting to network 105 and communicating with other compute devices, including robotic device 102 and/or 110.

In some embodiments, one or more robotic devices, e.g., robotic device 102 and/or 110, can be configured to communicate via network 105 with a server 120 and/or compute device 150. Server 120 can include component(s) that are remotely situated from the robotic devices and/or located on premises near the robotic devices. Compute device 150 can include component(s) that are remotely situated from the robotic devices, located on premises near the robotic devices, and/or integrated into a robotic device. Server 120 and/or compute device 150 can include a user interface that enables a user, referred to as a robot supervisor, to control the operation of the robotic devices. For example, the user can interrupt and/or modify the execution of one or more actions performed by the robotic devices. These actions can include, for example, navigation behaviors, manipulation behaviors, head behaviors, sounds/lights, and/or other components of a robotic device. In some embodiments, the robot supervisor can remotely monitor the robotic devices and control their operation for safety reasons. For example, the robot supervisor can command a robotic device to stop or modify an execution of an action to avoid endangering a human or causing damage to the robotic device or another object in an environment. In some embodiments, a robotic device can be configured to solicit user intervention at specific points during its execution of an action. For example, the robotic device can solicit user intervention at points when the robotic device cannot confirm certain information about itself and/or environment around itself, when the robotic device cannot determine a trajectory for completing an action or navigating to a specific location, when the robotic device has been programmed in advance to solicit user input (e.g., during learning using an interactive learning template, as further described below), etc. In some embodiments, a robotic device can request feedback from a user at specific points during learning and/or execution. For example, the robotic device can prompt a user to specify what information it should collect (e.g., information associated with a manipulation element or a transport element, information associated with the surrounding environment) and/or when to collect information (e.g., the timing of keyframes during a demonstration). Alternatively or additionally, the robotic device can request that a user tag the robotic device's past or current behavior in a specific context as a positive or negative example of an action. The robotic device can be configured to use this information to improve future execution of the action in the specific context.

Systems and Devices

FIG. 2 schematically illustrates a robotic device 200, according to some embodiments. Robotic device 200 includes a control unit 202, a user interface 240, at least one manipulating element 250, and at least one sensor 270. Additionally, in some embodiments, robotic device 200 optionally includes at least one transport element 260. Control unit 202 includes a memory 220, a storage 230, a processor 204, a system bus 206, and at least one input/output interface (“I/O interface”) 208. Memory 220 can be, for example, a random access memory (RAM), a memory buffer, a hard drive, a database, an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a read-only memory (ROM), and/or so forth. In some embodiments, memory 220 stores instructions that cause processor 204 to execute modules, processes, and/or functions associated with scanning or viewing an environment, learning a skill, and/or executing a skill. Storage 230 can be, for example, a hard drive, a database, a cloud storage, a network-attached storage device, or other data storage device. In some embodiments, storage 230 can store, for example, sensor data including state information regarding one or more components of robotic device 200 (e.g., manipulating element 250), learned models, marker location information, etc.

Processor 204 of control unit 202 can be any suitable processing device configured to run and/or execute functions associated with viewing an environment, learning a skill, and/or executing a skill. For example, processor 204 can be configured to generate a model for a skill based on sensor information, or execute a skill by generating, using a model, a trajectory for performing a skill, as further described herein. More specifically, processor 204 can be configured to execute modules, functions, and/or processes. In some embodiments, processor 204 can be a general purpose processor, a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), and/or the like.

System bus 206 can be any suitable component that enables processor 204, memory 220, storage 230, and/or other components of control unit 202 to communicate with each other. I/O interface(s) 208, connected to system bus 206, can be any suitable component that enables communication between internal components of control unit 202 (e.g., processor 204, memory 220, storage 230) and external input/output devices, such as user interface 240, manipulating element(s) 250, transport element(s) 260, and sensor(s) 270.

User interface 240 can include one or more components that are configured to receive inputs and send outputs to other devices and/or a user operating a device, e.g., a user operating robotic device 200. For example, user interface 240 can include a display device 242 (e.g., a display, a touch screen, etc.), an audio device 244 (e.g., a microphone, a speaker), and optionally one or more additional input/output device(s) (“I/O device(s)”) 246 configured for receiving an input and/or generating an output to a user.

Manipulating element(s) 250 can be any suitable component that is capable of manipulating and/or interacting with a stationary and/or moving object, including, for example, a human. Manipulating element(s) 250 can include a plurality of segments that are coupled to one another via joints that can provide for translation along and/or rotation about one or more axes. Manipulating element(s) 250 can optionally include an end effector that can engage with and/or otherwise interact with objects in an environment. For example, a manipulating element can include a gripping mechanism that can releasably engage (e.g., grip) objects in the environment to pick up and/or transport the objects. Other examples of end effectors include, for example, vacuum engaging mechanism(s), magnetic engaging mechanism(s), suction mechanism(s), and/or combinations thereof. In some embodiments, one or more manipulating element(s) 250 can be retractable into a housing of the robotic device 200 when not in use to reduce one or more dimensions of the robotic device. In some embodiments, manipulating element(s) 250 can include a head or other humanoid component configured to interact with an environment and/or one or more objects within the environment, including humans. A detailed view of an example manipulating element is depicted in FIG. 4.

Transport element(s) 260 can be any suitable component configured for movement, such as, for example, a wheel or a track. One or more transport element(s) 260 can be provided on a base portion of robotic device 200 to enable robotic device 200 to move around an environment. For example, robotic device 200 can include a plurality of wheels that enable it to navigate around a building, such as, for example, a hospital. Transport element(s) 260 can be designed and/or dimensioned to facilitate movement through tight and/or constrained spaces (e.g., small hallways and corridors, small rooms such as supply closets, etc.). In some embodiments, transport element(s) 260 can be rotatable about an axis and/or movable relative to one another (e.g., along a track). In some embodiments, one or more transport element(s) 260 can be retractable into a base of the robotic device 200 when not in use to reduce one or more dimensions of the robotic device.

Sensor(s) 270 can be any suitable component that enables robotic device 200 to capture information about the environment and/or objects in the environment around robotic device 200. Sensor(s) 270 can include, for example, image capture devices (e.g., cameras, such as a red-green-blue-depth (RGB-D) camera or a webcam), audio devices (e.g., microphones), light sensors (e.g., light detection and ranging or lidar sensors, color detection sensors), proprioceptive sensors, position sensors, tactile sensors, force or torque sensors, temperature sensors, pressure sensors, motion sensors, sound detectors, etc. For example, sensor(s) 270 can include at least one image capture device such as a camera for capturing visual information about objects and the environment around robotic device 200. In some embodiments, sensor(s) 270 can include haptic sensors, e.g., sensors that can convey forces, vibrations, touch, and other non-visual information to robotic device 200.

In some embodiments, robotic device 200 can have humanoid features, e.g., a head, a body, arms, legs, and/or a base. For example, robotic device 200 can include a face with eyes, a nose, a mouth, and other humanoid features. These humanoid features can form and/or be part of one or more manipulating element(s). While not schematically depicted, robotic device 200 can also include actuators, motors, couplers, connectors, power sources (e.g., an onboard battery), and/or other components that link, actuate, and/or drive different portions of robotic device 200.

FIG. 3 is a block diagram that schematically illustrates a control unit 302, according to some embodiments. Control unit 302 can include similar components as control unit 202, and can be structurally and/or functionally similar to control unit 202. For example, control unit 302 includes a processor 304, a memory 320, I/O interface(s) 308, a system bus 306, and a storage 330, which can be structurally and/or functionally similar to processor 204, memory 220, I/O interface(s) 208, system bus 206, and storage 230, respectively.

Memory 320 stores instructions that can cause processor 304 to execute modules, processes, and/or functions, illustrated as active scanning 322, marker identification 324, learning and model generation 326, trajectory generation and execution 328, and success monitoring 329. Active scanning 322, marker identification 324, learning and model generation 326, trajectory generation and execution 328, and success monitoring 329 can be implemented as one or more programs and/or applications that are tied to hardware components (e.g., a sensor, a manipulating element, an I/O device, a processor, etc.). Active scanning 322, marker identification 324, learning and model generation 326, trajectory generation and execution 328, and success monitoring 329 can be implemented by one robotic device or multiple robotic devices. For example, a robotic device can be configured to implement active scanning 322, marker identification 324, and trajectory generation and execution 328. As another example, a robotic device can be configured to implement active scanning 322, marker identification 324, optionally learning and model generation 326, and trajectory generation and execution 328. As another example, a robotic device can be configured to implement active scanning 322, marker identification 324, trajectory generation and execution 328, and optionally success monitoring 329. While not depicted, memory 320 can also store programs and/or applications associated with an operating system, and general robotic operations (e.g., power management, memory allocation, etc.).

Storage 330 stores information relating to skill learning and/or execution. Storage 330 stores, for example, internal state information 331, model(s) 334, object information 340, and machine learning libraries 342. Internal state information 331 can include information regarding a state of a robotic device (e.g., robotic device 200) and/or an environment in which the robotic device is operating (e.g., a building, such as, for example, a hospital). In some embodiments, state information 331 can indicate a location of the robotic device within the environment, such as, for example, a room, a floor, an enclosed space, etc. For example, state information 331 can include a map 332 of the environment, and indicate a location of the robotic device within that map 332. State information 331 can also include the location(s) of one or more objects (or markers representing and/or associated with objects) within the environment, e.g., within map 332. Thus, state information 331 can identify a location of a robotic device relative to one or more objects. Objects can include any type of physical object that is located within the environment, including objects that define a space or an opening (e.g., surfaces or walls that define a doorway). Objects can be stationary or mobile. Examples of objects in an environment, such as, for example, a hospital, include equipment, supplies, instruments, tools, furniture, and/or humans (e.g., nurses, doctors, patients, etc.).

In some embodiments, state information 331 can include a representation or map of the environment, such as depicted in FIG. 19. The map of the environment can include, for example, a navigation layer, a static semantic layer, a social layer, and a dynamic layer. Information learned by a robotic device, e.g., from demonstrations, user input, perceived sensor information, etc., can feed into the different layers of the map and be organized for future reference by the robotic device (and/or other robotic devices). For example, a robotic device may rely on information learned about different objects within an environment (e.g., a door) to decide how to arbitrate between different behaviors (e.g., waiting for a door to a tight doorway to be opened before going through the doorway, seeking assistance to open a door before going through a tight doorway), as further described herein.

Object information 340 can include information relating to physical object(s) in an environment. For example, object information can include information identifying or quantifying different features of an object, such as, for example, location, color, shape, and surface features. Object information can also identify codes, symbols, and other markers that are associated with a physical object, e.g., Quick Response or “QR” codes, barcodes, tags, etc. In some embodiments, object information can include information characterizing an object within the environment, e.g., a doorway or hallway as being tight, a door handle as being a type of door handle, etc. Object information can enable control unit 302 to identify physical object(s) in the environment.

State information 331 and/or object information 340 can be examples of environmental constraints, as used and described herein. Environmental constraints can include information regarding an environment that can be presented or viewed in various dimensions. For example, environmental constraints can vary across location, time, social interactions, specific contexts, etc. A robotic device can learn environmental constraints from user inputs, demonstrations of a skill, interactions with an environment, self-exploration, execution of skills, etc.

Machine learning libraries 342 can include modules, processes, and/or functions relating to different algorithms for machine learning and/or model generation of different skills. In some embodiments, machine learning libraries can include methods such as Hidden Markov Models or “HMMs.” An example of an existing machine learning library in Python is scikit-learn. Storage 330 can also include additional software libraries relating to, for example, robotics simulation, motion planning and control, kinematics teaching and perception, etc.
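As an illustration of how such a library might be used, the following sketch fits a Gaussian HMM to sensor features recorded across demonstrations, assuming the hmmlearn package (the HMM implementation spun out of scikit-learn). The feature layout, number of hidden phases, and placeholder data are assumptions for illustration only.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

# demo_features: one row per time step, e.g., joint positions stacked with
# end-effector forces, concatenated from two recorded demonstrations.
demo_features = np.random.rand(200, 8)   # placeholder data for illustration
lengths = [100, 100]                     # per-demonstration sequence lengths

model = GaussianHMM(n_components=5)      # five hidden skill phases (assumed)
model.fit(demo_features, lengths)
phases = model.predict(demo_features)    # segment a demonstration into phases
```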

Model(s) 334 are models that have been generated for performing different actions, and represent skills that have been learned by a robotic device. In some embodiments, each model 334 is associated with a set of markers that are tied to different physical objects in an environment. Marker information 335 can indicate which markers are associated with a particular model 334. Each model 334 can also be associated with sensory information 336 that is collected, e.g., via one or more sensors of a robotic device, during kinesthetic teaching and/or other demonstrations of a skill. Sensory information 336 can optionally include manipulating element information 337 associated with a manipulating element of a robotic device as it performs an action during a demonstration. Manipulating element information 337 can include, for example, joint configurations, end effector positions and configurations, and/or forces and torques acting on joints and/or end effectors. Manipulating element information 337 can be recorded at specific points during a demonstration and/or execution of a skill (e.g., keyframes), or alternatively, throughout a demonstration and/or execution of a skill. Sensory information 336 can also include information associated with an environment in which a skill is demonstrated and/or executed, e.g., location of markers in the environment. In some embodiments, each model 334 can be associated with success criteria 339. Success criteria 339 can be used to monitor the execution of a skill. In some embodiments, success criteria 339 can include information associated with visual and haptic data that are perceived using one or more sensors, e.g., cameras, force/torque sensors, etc. Success criteria 339 can be, for example, tied to visually detecting movement of an object, sensing a force that is acting on a component of the robotic device (e.g., a weight from an object), sensing an engagement between a component of the robotic device and an object (e.g., a change in pressure or force acting on a surface), etc. Examples of using haptic data in robotic learning of manipulation skills are described in the article entitled “Learning Haptic Affordances from Demonstration and Human-Guided Exploration,” authored by Chu et al., published in 2016 IEEE Haptics Symposium (HAPTICS), Philadelphia, Pa., 2016, pp. 119-125, accessible at http://ieeexplore.ieee.org/document/7463165/, incorporated herein by reference. Examples of using visual data in robotic learning of manipulation skills are described in the article entitled “Simultaneously Learning Actions and Goals from Demonstration,” authored by Akgun et al., published in Autonomous Robots, Volume 40, Issue 2, February 2016, pp. 211-227, accessible at https://doi.org/10.1007/s10514-015-9448-x (“Akgun article”), incorporated herein by reference.

Optionally, sensory information 336 can include transport element information 338. Transport element information 338 can be associated with movements of a transport element (e.g., transport element(s) 260) of a robotic device as the robotic device undergoes a demonstration of a skill (e.g., navigation through a doorway, transport of an object, etc.). Transport element information 338 can be recorded at specific points during a demonstration and/or execution of a skill, such as at keyframes associated with the skill, or alternatively, throughout a demonstration and/or execution of a skill.
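By way of illustration only, the following Python sketch shows one possible in-memory layout for a learned model along the lines described above, with per-keyframe manipulating element information and a simple haptic success check. All names and the particular success test are hypothetical assumptions, not the specific representation of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import numpy as np

@dataclass
class KeyframeRecord:
    joint_config: np.ndarray     # joint configuration at the keyframe
    ee_pose: np.ndarray          # end effector position and orientation
    wrench: np.ndarray           # forces/torques acting on the end effector

@dataclass
class SkillModel:
    marker_ids: List[str]                             # markers the skill is tied to
    keyframes: List[KeyframeRecord] = field(default_factory=list)
    success_criteria: Dict[str, float] = field(default_factory=dict)

    def check_success(self, observed_wrench: np.ndarray) -> bool:
        """Example haptic success test: the sensed force must exceed a learned
        threshold (e.g., the weight of a grasped object)."""
        threshold = self.success_criteria.get("min_force", 0.0)
        return float(np.linalg.norm(observed_wrench)) >= threshold
```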

In some embodiments, an initial set of environmental constraints (e.g., state information 331, object information 340, etc.) and/or skills (e.g., model(s) 334) can be provided to a robotic device, e.g., via a remote administrator or supervisor. The robotic device can adapt and/or add to its knowledge of environmental constraints and/or skills based on its own interactions, demonstrations, etc. with an environment and/or via additional user input. Alternatively or additionally, a robot supervisor can update the robotic device's knowledge of environmental constraints and/or skills based on new information collected by the robotic device or other robotic device(s) (e.g., other robotic device(s) within similar or the same environment, e.g., a hospital) and/or provided to the robot supervisor by external parties (e.g., suppliers, administrators, manufacturers, etc.). Such updates can be periodically and/or continuously provided, as new information about an environment or skill is provided to the robotic device and/or robot supervisor.

FIG. 4 schematically illustrates a manipulating element 350, according to some embodiments. Manipulating element 350 can form a part of a robotic device, such as, for example, robotic device 102 and/or 200. Manipulating element 350 can be implemented as an arm that includes two or more segments 352 coupled together via joints 354. Joints 354 can allow one or more degrees of freedom. For example, joints 354 can provide for translation along and/or rotation about one or more axes. In an embodiment, manipulating element 350 can have seven degrees of freedom provided by joints 354. While four segments 352 and four joints 354 are depicted in FIG. 4, one of ordinary skill in the art would understand that a manipulating element can include a different number of segments and/or joints.

Manipulating element 350 includes an end effector 356 that can be used to interact with objects in an environment. For example, end effector 356 can be used to engage with and/or manipulate different objects. Alternatively or additionally, end effector 356 can be used to interact with movable or dynamic objects, including, for example, humans. In some embodiments, end effector 356 can be a gripper that can releasably engage or grip one or more objects. For example, end effector 356 implemented as a gripper can pick up and move an object from a first location (e.g., a supply closet) to a second location (e.g., an office, a room, etc.).

A plurality of sensors 353, 355, 357, and 358 can be disposed on different components of manipulating element 350, e.g., segments 352, joints 354, and/or end effector 356. Sensors 353, 355, 357, and 358 can be configured to measure sensory information, including environmental information and/or manipulating element information. Examples of sensors include position encoders, torque and/or force sensors, touch and/or tactile sensors, image capture devices such as cameras, temperature sensors, pressure sensors, light sensors, etc. In some embodiments, sensor 353 disposed on a segment 352 can be a camera that is configured to capture visual information about an environment. In some embodiments, sensor 353 disposed on a segment 352 can be an accelerometer configured to enable measurement of an acceleration, and/or calculation of speed of movement, and/or a position, of segment 352. In some embodiments, sensor 355 disposed on a joint 354 can be a position encoder configured to measure a position and/or configuration of joint 354. In some embodiments, sensor 355 disposed on a joint 354 can be a force or torque sensor configured to measure a force or torque applied to joint 354. In some embodiments, sensor 358 disposed on end effector 356 can be a position encoder and/or a force or torque sensor. In some embodiments, sensor 357 disposed on end effector 356 can be a touch or tactile sensor configured to measure an engagement between end effector 356 and an object in the environment. Alternatively or additionally, one or more of sensors 353, 355, 357, and 358 can be configured to record information about one or more objects and/or markers in the environment. For example, sensor 358 disposed on end effector 356 can be configured to track a location of an object in the environment and/or a position of the object relative to end effector 356. In some embodiments, one or more of sensors 353, 355, 357, and 358 can also track whether an object, such as a human, has moved in the environment. Sensors 353, 355, 357, and 358 can send the sensory information that they record to a compute device located on a robotic device (e.g., an onboard control unit such as, for example, control unit 202 and/or 302), or sensors 353, 355, 357, and 358 can send the sensory information to a remote compute device (e.g., a server such as, for example, server 120).

Manipulating element 350 can optionally include a coupling element 359 that enables manipulating element 350 to be releasably coupled to a robotic device, such as any of the robotic devices described herein. In some embodiments, manipulating element 350 can be coupled to a fixed location of the robotic device and/or be capable of being coupled to multiple locations of the robotic device (e.g., a right side or a left side of a body of the robotic device, as shown in FIG. 5). Coupling element 359 can include any type of mechanism that can couple manipulating element 350 to the robotic device, such as, for example, a mechanical mechanism (e.g., a fastener, a latch, a mount), a magnetic mechanism, a friction fit, etc.

FIG. 20 is a block diagram that schematically illustrates a control unit 1702 of a robotic system, according to some embodiments. The control unit 1702 can include similar components as other control units described herein (e.g., control units 202 and/or 302). For example, the control unit 1702 includes a processor 1704, a memory 1720, I/O interface(s) 1708, a system bus 1706, and a storage 1730, which can be structurally and/or functionally similar to the processor, memory, I/O interface(s), system bus, and storage of control units 202 and/or 302, respectively. Control unit 1702 can be located on a robotic device and/or at a remote server that is connected to one or more robotic devices.

Memory 1720 stores instructions that can cause processor 1704 to execute modules, processes, and/or functions, including active sensing 1722, skill and behavior learning 1724, and action execution 1726, and optionally resource arbitration 1728 and data tracking & analytics 1758. Active sensing 1722, skill and behavior learning 1724, action execution 1726, resource arbitration 1728, and data tracking & analytics 1758 can be implemented as one or more programs and/or applications that are tied to hardware components (e.g., a sensor, a manipulating element, an I/O device, a processor, etc.). Active sensing 1722, skill and behavior learning 1724, action execution 1726, resource arbitration 1728, and data tracking & analytics 1758 can be implemented by one robotic device or multiple robotic devices. In an embodiment, active sensing 1722 can include active scanning of an environment, as described herein. In other embodiments, active sensing 1722 can include scanning of an environment and/or sensing or perceiving information associated with the environment, object(s) within the environment (e.g., including humans within the environment), and/or one or more conditions associated with a robotic device or system.

Similar to storage 330, storage 1730 stores information relating to an environment and/or objects within the environment, and learning and/or execution of skills (e.g., tasks and/or social behaviors). Storage 1730 stores, for example, state information 1732, skill model(s) 1734, object information 1740, machine learning libraries 1742, and/or environmental constraint(s) 1754. Optionally, storage 1730 can also store tracked information 1756 and/or arbitration algorithm(s) 1758.

State information 1732 includes information regarding a state of a robotic device (such as any of the robotic devices described herein) and/or an environment in which the robotic device is operating. In some embodiments, state information 1732 can include a map of the environment, along with additional static and/or dynamic information associated with objects within the environment. For example, state information 1732 can include a navigational map of a building, along with static and/or dynamic information regarding objects (e.g., supplies, equipment, etc.) within the building and social context information associated with humans and/or social settings within the building. FIG. 19 provides a schematic view of an example map or representation 1600 of a building. Representation 1600 includes a navigation layer 1610, a static semantic layer 1620, a social layer 1630, and a dynamic layer 1640. Navigation layer 1610 provides a general layout or map of the building, which may identify a number of floor(s) 1612 with wall(s) 1614, stair(s) 1616, and other elements built into the building (e.g., hallways, openings, boundaries). Static semantic layer 1620 identifies objects and/or spaces within the building, such as room(s) 1622, object(s) 1624, door(s) 1626, etc. Static semantic layer 1620 can identify which room(s) 1622 or other spaces are accessible or not accessible to a robotic device. In some embodiments, static semantic layer 1620 can provide a three-dimensional map of the objects located within a building. Social layer 1630 provides social context information 1632. Social context information 1632 includes information associated with humans within the building, such as past interactions between robotic device(s) and human(s). Social context information 1632 can be used to track interactions between robotic device(s) and human(s), which can be used to generate and/or adapt existing models of skills involving one or more interactions between a robotic device and a human. For example, social context information 1632 can indicate that a human is typically located at a particular location, such that a robotic device having knowledge of that information can adapt its execution of a skill that would require the robotic device to move near the location of the human. Dynamic layer 1640 provides information on object(s) and other elements within a building that may move and/or change over time. For example, dynamic layer 1640 can track movement(s) 1644 and/or change(s) 1646 associated with object(s) 1642. In an embodiment, dynamic layer 1640 can monitor the expiration date of an object 1642 and identify when that object 1642 has expired.
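As a rough illustration of how such a layered representation might be organized in software, the following Python sketch defines one possible data structure for the navigation, static semantic, social, and dynamic layers. The class and field names are hypothetical and are not taken from the disclosure; this is a minimal sketch, not the implementation of representation 1600.

    from dataclasses import dataclass, field
    from typing import Dict, Tuple

    # Hypothetical sketch of a layered building representation (names are illustrative).

    @dataclass
    class NavigationLayer:
        walls: list = field(default_factory=list)    # e.g., line segments (x1, y1, x2, y2)
        stairs: list = field(default_factory=list)

    @dataclass
    class StaticSemanticLayer:
        rooms: Dict[str, dict] = field(default_factory=dict)       # room id -> {"accessible": bool, ...}
        objects: Dict[str, Tuple[float, float]] = field(default_factory=dict)  # object id -> (x, y)

    @dataclass
    class SocialLayer:
        human_presence: Dict[str, float] = field(default_factory=dict)  # location id -> likelihood a human is present

    @dataclass
    class DynamicLayer:
        last_seen: Dict[str, Tuple[float, float]] = field(default_factory=dict)  # object id -> last observed (x, y)
        expirations: Dict[str, str] = field(default_factory=dict)                # object id -> expiration date

    @dataclass
    class BuildingRepresentation:
        navigation: NavigationLayer
        static_semantic: StaticSemanticLayer
        social: SocialLayer
        dynamic: DynamicLayer

        def is_room_accessible(self, room_id: str) -> bool:
            return self.static_semantic.rooms.get(room_id, {}).get("accessible", False)

    rep = BuildingRepresentation(NavigationLayer(), StaticSemanticLayer(), SocialLayer(), DynamicLayer())
    rep.static_semantic.rooms["supply_room_2"] = {"accessible": True}
    print(rep.is_room_accessible("supply_room_2"))  # True

In this sketch, each layer can be updated independently as robotic device(s) collect new observations, while queries combine information across layers.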

Representation 1600 can be accessible to and/or managed by a control unit 1602, e.g., of a robotic device. Control unit 1602 can include similar components as other control units described herein (e.g., control units 202, 302, and/or 1702). Control unit 1602 can include a storage (similar to other storage elements described herein, such as, for example, storage 330 and/or 1730) that stores state information 1604, including representation 1600 as well as information associated with one or more robotic devices (e.g., a configuration of an element of a robotic device, a location of the robotic device in a building, etc.). Control unit 1602 can be located on a robotic device and/or at a remote server that is connected to one or more robotic devices. Robotic device(s) can be configured to update and maintain state information 1604, including information associated with representation 1600 of the building, as the robotic device(s) collect information on their surrounding environment.

In some embodiments, control unit 1602 can also optionally include a data tracking & analytics element 1606. Data tracking & analytics element 1606 can be, for example, a computing element (e.g., a processor) configured to perform data tracking and/or analytics of the information collected by one or more robotic devices, e.g., information contained within representation 1600 and/or other state information 1604. For example, data tracking & analytics element 1606 can be configured to manage inventory, e.g., tracking expiration dates, monitoring and recording the use of inventory items, ordering new inventory items, analyzing and recommending new inventory items, etc. In a hospital setting, data tracking & analytics element 1606 can manage the use and/or maintenance of medical supplies and/or equipment. In some embodiments, data tracking & analytics element 1606 can generate aggregate data displays (e.g., reports, charts, etc.), which can be used to comply with formal laws, regulations, and/or standards. In some embodiments, data tracking & analytics element 1606 can be configured to analyze information associated with humans, such as patients within a hospital. For example, a robotic device can be configured to collect information on patient(s) within a hospital and to pass that information to data tracking & analytics element 1606 to analyze and/or summarize for use in various functions, including, for example, diagnostic tests and/or screening. In some embodiments, data tracking & analytics element 1606 can be configured to access and/or obtain information from third-party systems (e.g., hospital electronic medical records, security system data, insurance data, vendor data, etc.), and use and/or analyze that data, with or without data collected by one or more robotic devices, to perform data tracking and/or analytics functions.

As depicted in FIG. 20, skill model(s) 1734 are models that can be used to learn and/or execute various actions or skills, including tasks and/or behaviors. Model(s) 1734 can be similar to model(s) 334, as described herein with reference to FIG. 3. For example, model(s) 1734 can include information associated with object(s) that are involved in the execution of a skill (e.g., an object that is manipulated by the robotic device, an object that a robotic device interacts with during execution of a skill, an object that a robotic device takes into account while executing a skill). As noted above, examples of objects include stationary and/or mobile objects, such as, for example, supplies, equipment, humans, and/or surfaces that define openings (e.g., doorways). The information associated with the object(s) can include, for example, markers that identify an object and/or features of an object. Additionally or alternatively, model(s) 1734 can include sensory information that is collected by a robotic device, e.g., during learning and/or execution of a skill or active scanning. As described above, sensory information can include information associated with one or more components of a robotic device (e.g., a manipulating element, a transport element), as that robotic device learns and/or executes a skill, and/or information associated with an environment in which a skill is learned and/or executed (e.g., location of object(s) within the environment, social context information, etc.). In some embodiments, a model 1734 can be associated with success criteria, such as, for example, visual and/or haptic data perceived using one or more sensors of a robotic device that indicates a successful execution of a skill.

Object information 1740 can include information relating to physical object(s) (e.g., location, color, shape, surface features, and/or identification codes). Machine learning libraries 1742 can include modules, processes, and/or functions relating to different algorithms for machine learning and/or model generation of different skills.

Environmental constraints 1754 include information associated with objects and/or conditions within an environment that may restrict the operation of a robotic device within the environment. For example, environmental constraints 1754 can include information associated with the size, configuration, and/or location of objects within an environment (e.g., a supply bin, a room, a doorway, etc.), and/or information that indicates that certain areas (e.g., a room, a hallway, etc.) have restricted access. Environmental constraints 1754 may affect the learning and/or execution of one or more actions within an environment. As such, an environmental constraint 1754 may become part of each model for a skill that is executed within a context including the environmental constraint.

Tracked information 1756 includes information associated with a representation of an environment (e.g., representation 1600) and/or information that is obtained from third-party systems (e.g., hospital electronic medical records, security system data, insurance data, vendor data, etc.) that is tracked and/or analyzed, e.g., by a data tracking & analytics element, such as data tracking & analytics element 1606 or processor 1704 executing data tracking & analytics 1758. Examples of tracked information 1756 include inventory data, supply chain data, point-of-use data, and/or patient data, as well as any aggregate data compiled from such data.

Arbitration algorithm(s) 1758 include algorithms for arbitrating or selecting between different actions to execute (e.g., how to use different resources or components of a robotic device). Arbitration algorithm(s) 1758 can be rules and, in some embodiments, can be learned, as further described with reference to FIG. 24. These algorithms can be used by a robotic device to select between different actions when the robotic device has multiple resources and/or objectives to manage. For example, a robotic device operating in an unstructured and/or dynamic environment, e.g., an environment including humans, may be exposed to a number of conditions at any point in time that may demand different behaviors or actions from the robotic device. In such instances, the robotic device can be configured to select between the different actions based on one or more arbitration algorithms 1758, which may assign different priorities to the various actions based on predefined rules (e.g., affective states, environmental constraints, socially defined constraints, etc.). In an embodiment, an arbitration algorithm 1758 can assign different scores or values to actions based on information collected by a robotic device about its surrounding environment, its current state, and/or other factors.
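As a rough illustration of score-based arbitration, the Python sketch below weights candidate actions by rule-derived priority terms and selects the highest-scoring one. The action names, context fields, and weights are hypothetical and are not the disclosed arbitration algorithm; this is a minimal sketch of the general idea.

    # Minimal sketch of score-based action arbitration (illustrative only).

    def score_action(action, context, weights):
        """Combine rule-based priority terms into a single score."""
        score = weights.get("base", 1.0) * action["base_priority"]
        if context.get("humans_nearby") and action.get("social"):
            score += weights.get("social", 2.0)       # favor socially appropriate actions
        if context.get("battery_low") and action["name"] == "return_to_dock":
            score += weights.get("battery", 5.0)      # charging becomes urgent
        if action.get("blocked_by_constraint"):
            score -= weights.get("constraint", 10.0)  # environmental constraints penalize actions
        return score

    def arbitrate(actions, context, weights=None):
        weights = weights or {}
        return max(actions, key=lambda a: score_action(a, context, weights))

    actions = [
        {"name": "deliver_supplies", "base_priority": 3.0},
        {"name": "return_to_dock", "base_priority": 1.0},
        {"name": "yield_to_human", "base_priority": 2.0, "social": True},
    ]
    context = {"humans_nearby": True, "battery_low": False}
    print(arbitrate(actions, context)["name"])  # yield_to_human

A learned arbitration algorithm could, in principle, replace the hand-written scoring terms with weights estimated from recorded outcomes.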

Similar to I/O interface(s) 208 and/or 308, I/O interface(s) 1708 can be any suitable component(s) that enable communication between internal components of control unit 1702 and external devices, such as a user interface, a manipulating element, a transport element, and/or a compute device. I/O interface(s) 1708 can include a network interface 1760 that can connect control unit 1702 to a network (e.g., network 105, as depicted in FIG. 1). Network interface 1760 enables communications between control unit 1702 (which can be located on a robotic device or another network device in communication with one or more robotic devices) and a remote device, such as a compute device that can be used by a robot supervisor to monitor and/or control one or more robotic devices. Network interface 1760 can be configured to provide a wireless and/or wired connection to a network.

FIG. 5 schematically illustrates a robotic device 400, according to some embodiments. Robotic device 400 includes a head 480, a body 488, and a base 486. Head 480 can be connected to body 488 via a segment 482 and one or more joints (not depicted). Segment 482 can be movable and/or flexible to enable head 480 to move relative to body 488. Head 480, segment 482, etc. can be examples of manipulating element(s), and include similar functionality and/or structure as other manipulating element(s) described herein.

Head 480 includes one or more image capture devices 472 and/or other sensors 470. Image capture device 472 and/or other sensors 470 (e.g., lidar sensors, motion sensors, etc.) can enable robotic device 400 to scan an environment and obtain a representation (e.g., a visual representation or other semantic representation) of the environment. In some embodiments, image capture device 472 can be a camera. In some embodiments, image capture device 472 can be movable such that it can be used to focus on different areas of the environment around robotic device 400. Image capture device 472 and/or other sensors 470 can collect and send sensory information to a compute device or processor onboard robotic device 400, such as, for example, control unit 202 or 302. In some embodiments, head 480 of robotic device 400 can have a humanoid shape, and include one or more human features, e.g., eyes, nose, mouth, ears, etc. In such embodiments, image capture device 472 and/or other sensors 470 can be implemented as one or more human features. For example, image capture device 472 can be implemented as eyes on head 480.

In some embodiments, robotic device 400 can use image capture device 472 and/or other sensors 470 to scan an environment for information about objects in the environment, e.g., physical structures, devices, articles, humans, etc. Robotic device 400 can engage in active scanning, or robotic device 400 can initiate scanning in response to a trigger (e.g., an input from a user, a detected event or change in the environment).

In some embodiments, robotic device 400 can engage in adaptive scanning, where scanning can be performed based on stored knowledge and/or a user input. For example, robotic device 400 can identify an area in the environment to scan for an object based on prior information that it has on the object. Referring to FIG. 6A, robotic device 400 can scan a scene (e.g., an area of a room) and obtain a representation 500 of the scene. In representation 500, robotic device 400 identifies that a first object 550 is located in an area 510 and that a second object 560 is located in areas 510 and 530. Robotic device 400 can store the locations of objects 550 and 560 in a map of the environment that it stores internally, such that robotic device 400 can use that information to locate objects 550 and 560 when performing a future scan. For example, when robotic device 400 returns to the scene and scans the scene a second time, robotic device 400 may obtain a different view of the scene, as shown in FIG. 6B. When performing this second scan, robotic device 400 can obtain a representation 502 of the scene. To locate objects 550 and 560 in representation 502, robotic device 400 can refer to information that it had previously stored about the locations of those objects when it had obtained representation 500 of the scene. Robotic device 400 can take into account that its own location in the environment may have changed, and recognize that objects 550 and 560 may be located in different areas of representation 502. Based on this information, robotic device 400 may know to look in area 510 for object 550 but to look in areas 520 and 540 for object 560. Robotic device 400, by using previously stored information about the locations of objects 550 and 560, can automatically identify areas to scan closely (e.g., by zooming in, by slowly moving a camera through those areas) for objects 550 and 560.
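One way to picture this adaptive-scanning step is to project previously stored object locations into the robot's current viewpoint and derive regions of interest to scan closely. The Python sketch below is a simplification under assumed names and a 2-D rigid transform between world and robot frames; it is not the disclosed implementation.

    import math

    # Illustrative sketch: project stored object locations (world frame) into the
    # robot's current frame to pick areas worth scanning closely.

    def world_to_robot(point, robot_pose):
        """Transform a world-frame (x, y) point into the robot frame given pose (x, y, heading)."""
        px, py = point
        rx, ry, theta = robot_pose
        dx, dy = px - rx, py - ry
        return (math.cos(-theta) * dx - math.sin(-theta) * dy,
                math.sin(-theta) * dx + math.cos(-theta) * dy)

    def regions_to_scan(stored_objects, robot_pose, max_range=5.0):
        """Return object ids whose stored locations fall within sensing range of the current pose."""
        regions = {}
        for obj_id, world_xy in stored_objects.items():
            local = world_to_robot(world_xy, robot_pose)
            if math.hypot(*local) <= max_range:
                regions[obj_id] = local  # focus the camera near this local coordinate
        return regions

    stored = {"object_550": (2.0, 1.0), "object_560": (6.0, 4.0)}
    print(regions_to_scan(stored, robot_pose=(1.0, 0.5, math.pi / 6)))

In practice the stored map, the robot's pose estimate, and the sensing range would come from the robotic device's localization and sensor configuration rather than the constants shown here.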

In some embodiments, robotic device 400 can also know to scan different areas of a scene more closely based on an input by a human. For example, a human can indicate to robotic device 400 that a certain area of a scene includes one or more objects of interest, and robotic device 400 can scan those areas more closely to identify those objects. In such embodiments, robotic device 400 can include an input/output device 440, such as a display with a keyboard or other input device, and/or a touchscreen, as schematically depicted in FIG. 5.

In some embodiments, robotic device 400 can scan an environment and identify that an object, such as, for example, a human, is moving in the environment. For example, as shown in FIGS. 7A and 7B, an object 660 can be moving in an environment while an object 650 remains stationary. FIG. 7A depicts a representation 600 of a scene, showing object 660 in areas 610 and 630, and FIG. 7B depicts a representation 602 of the scene, showing object 660 in areas 620 and 640. In both representations 600 and 602, object 650 can remain in the same location in area 610. Robotic device 400 can identify that object 660 has moved in the scene and adjust its actions accordingly. For example, if robotic device 400 had plans to interact with object 660, robotic device 400 may change its trajectory, e.g., move closer to object 660 and/or change a trajectory of a manipulating element or other component that is configured to interact with object 660. Alternatively or additionally, if robotic device 400 had plans to interact with object 650 (and/or another object in the scene), robotic device 400 can take into account the movement of object 660 while planning its course for interacting with object 650. In some embodiments, robotic device 400 can engage in active scanning such that it can adjust its actions in near real-time.

As schematically depicted in FIG. 5, base 486 optionally can include one or more transport elements implemented as wheels 460. Wheels 460 can enable robotic device 400 to move around an environment, e.g., a hospital. Robotic device 400 also includes at least one manipulating element implemented as arms 450. Arms 450 can be structurally and/or functionally similar to other manipulating elements described herein, e.g., manipulating element 350. Arms 450 can be fixedly attached to body 488 of robotic device 400, or optionally, arms 450 can be releasably coupled to body 488 via a coupling element (e.g., coupling element 359) that can attach to a coupling portion 484 of robotic device 400. Coupling portion 484 can be configured to engage with coupling element 359 and provide an electrical connection between arm 450 and an onboard compute device (e.g., control unit 202 or 302), such that the onboard compute device can power and/or control components of arm 450, and receive information collected by sensors disposed on arm 450 (e.g., sensors 353, 355, 357, and 358).

Optionally, robotic device 400 can also include one or more additional sensor(s) 470 located on segment 482, body 488, base 486, and/or other parts of robotic device 400. Sensor(s) 470 can be, for example, image capture devices, force or torque sensors, motion sensors, light sensors, pressure sensors, and/or temperature sensors. Sensors 470 can enable robotic device 400 to capture visual and non-visual information about the environment.

Methods

FIGS. 8-11 are flow diagrams illustrating a method 700 that can be performed by a robotic system (e.g., robotic system 100) including one or more robotic devices, according to some embodiments. For example, all or a part of method 700 can be performed by one robotic device, such as any of the robotic devices described herein. Alternatively, all of method 700 can be performed sequentially by multiple robotic devices, each performing, in turn, a part of method 700. Alternatively, all or a part of method 700 can be performed concurrently by multiple robotic devices.

As depicted in FIG. 8, a robotic device can scan an environment and obtain a representation of the environment, at 702. The robotic device can scan the environment using one or more sensors (e.g., sensor(s) 270 or 470, and/or image capture device(s) 472). In some embodiments, the robotic device can scan the environment using a movable camera, where the position and/or focus of the camera can be adjusted to capture areas in a scene of the environment. At 704, based on the information collected during the scanning, the robotic device can analyze the data to identify marker(s) in the captured representation of the environment. The markers can be associated with one or more objects in the scene that have been marked using visual or fiducial markers, e.g., a visible marker such as a QR code, a barcode, a tag, etc. Alternatively or additionally, the robotic device can identify markers associated with one or more objects in the environment via object recognition using object information (e.g., object information 340) that is stored in a memory on the robotic device (e.g., storage 330). Object information can include, for example, information indicating different features of an object, such as location, color, shape, and surface features. In an embodiment, object information can be organized as numerical values that represent different features of an object, which can be referred to as a feature space.
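As a loose illustration of the two identification paths described above, the sketch below uses OpenCV's QR-code detector to find visible fiducial markers in a camera frame and a simple nearest-neighbor comparison over hypothetical feature vectors for unmarked objects. The feature-space matching shown is a stand-in, not the disclosed recognition method.

    import numpy as np
    import cv2  # OpenCV

    def detect_qr_markers(frame):
        """Return decoded QR payloads and their corner points found in an image frame."""
        detector = cv2.QRCodeDetector()
        ok, decoded, points, _ = detector.detectAndDecodeMulti(frame)
        return list(zip(decoded, points)) if ok else []

    def match_object(feature_vector, object_library):
        """Nearest-neighbor match of an observed feature vector against stored object features."""
        best_id, best_dist = None, float("inf")
        for obj_id, stored in object_library.items():
            dist = np.linalg.norm(np.asarray(feature_vector) - np.asarray(stored))
            if dist < best_dist:
                best_id, best_dist = obj_id, dist
        return best_id, best_dist

    # Usage (assumes a BGR image captured by the robotic device's camera):
    # frame = cv2.imread("scene.png")
    # print(detect_qr_markers(frame))
    library = {"supply_bin": [0.8, 0.1, 0.3], "door_handle": [0.2, 0.9, 0.5]}
    print(match_object([0.75, 0.15, 0.35], library))  # ('supply_bin', ...)

The feature vectors here are placeholders for whatever numerical features (color, shape, surface descriptors) the object information actually encodes.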

After identifying the marker(s), the robotic device can optionally present the marker(s) in a representation of the environment, at 706. In some embodiments, the representation of the environment can be a visual representation such as, for example, an augmented view of the environment. In such embodiments, the robotic device can display the visual representation of the environment, e.g., on a display screen, and display the locations of the marker(s) in the visual representation of the environment. Alternatively or additionally, the representation of the environment can be a semantic representation of the environment, with the locations of the marker(s), represented by semantic markers, in the environment.

In some embodiments, the robotic device can present the representation of the environment with the marker(s) to a user, and optionally prompt the user, e.g., via a user interface or other type of I/O device, to accept or reject the marker(s) in the representation of the environment, at 708. If the user does not accept the marker(s) (708: NO), then method 700 returns to 702, and the robotic device can rescan the environment to obtain a second representation of the environment. If the user accepts the marker(s) (708: YES), then method 700 proceeds to 710, where the robotic device can store information associated with the markers (e.g., location, features, etc.) in a memory (e.g., storage 330). For example, the robotic device can store the location of the marker(s) in an internal map of the environment (e.g., map 332).

In some embodiments, the robotic device can identify the marker(s) at 704 and proceed directly to store the location of the marker(s) and/or other information associated with the markers, at 710, without prompting a user to accept the marker(s). In such embodiments, the robotic device can analyze the location of the marker(s) prior to storing their location. For example, the robotic device can have previously stored information on the location of the marker(s) (e.g., that was acquired during a previous scan of the environment and/or inputted into the robotic device by a user or compute device), and can compare the location of the marker(s) to that previously stored information to check for accuracy and/or identify changes in marker locations. In particular, if the previously stored information indicates that a particular marker should be located in a location that is different from the location identified by the robotic device, then the robotic device may initiate an additional scan of the environment to verify the location of the marker before storing its location. Alternatively or additionally, the robotic device can send a notification to a user indicating that the location of a marker has changed. In such instances, the robotic device can store the new location of the marker but also store a message indicating that there has been a change in the marker location. A user or compute device can then, at a later point in time, review the message and reconcile the change in marker location.
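A minimal sketch of this reconciliation step is shown below, where a newly observed marker location is compared to the stored one and a notification is queued when the drift exceeds a tolerance. The threshold value, message format, and storage structure are assumptions for illustration only.

    import math

    # Illustrative sketch of reconciling a newly observed marker location with a stored one.

    def reconcile_marker(marker_id, observed_xy, stored_map, tolerance=0.25, notifications=None):
        notifications = notifications if notifications is not None else []
        stored_xy = stored_map.get(marker_id)
        if stored_xy is not None:
            drift = math.dist(observed_xy, stored_xy)
            if drift > tolerance:
                # Flag the change for later review instead of silently overwriting.
                notifications.append(f"marker {marker_id} moved {drift:.2f} m; verify with rescan")
        stored_map[marker_id] = observed_xy  # store the new location either way
        return notifications

    stored = {"bin_A": (3.0, 1.0)}
    print(reconcile_marker("bin_A", (3.6, 1.1), stored))

A real system could additionally trigger the verification rescan described above before committing the new location.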

Optionally, method 700 can proceed to 712, where the robotic device can prompt a user, e.g., via a user interface or other type of I/O device, to select a set of markers from the marker(s) identified in the representation of the environment, as depicted in FIG. 9. The user can make a selection, and the robotic device can receive the selection from the user, at 714. Alternatively, in some embodiments, the robotic device can automatically select a set of markers instead of prompting a user to make a selection. The robotic device can be programmed to select a marker based on certain predefined or learned rules and/or conditions. For example, the robotic device can be instructed to select a marker that is associated with a particular type of object (e.g., a supply item) during certain hours of the day, or when traffic in a building is low. In the latter case, the robotic device can determine when traffic in a building is low by actively moving through the building (e.g., patrolling and monitoring hallways and rooms) and scanning the environment. The robotic device can then know to select markers that are associated with particular objects at times when traffic in the building is lower than at a majority of other times.

After the robotic device receives a selection of a set of markers from the user and/or has automatically selected a set of markers, method 700 can proceed to learning a skill, at 716, or executing a skill, at 718.

For any particular skill, the robotic device can be taught the skill prior to the robotic device executing or performing the skill. For example, to acquire a manipulation skill, the robotic device can be taught using LfD (e.g., kinesthetic teaching), whereby a user or other robotic device can demonstrate skills to the robotic device. For example, a manipulating element such as an arm of the robotic device can be moved through a sequence of waypoints to interact with an object. As another example, a mobile base of a robotic device (e.g., a base with a transport element such as wheels, tracks, crawlers, etc.) can be navigated around objects within an environment by a user, e.g., by using a joystick, a user interface, or some other type of physical or virtual control device.

In the case of kinesthetic teaching, a user can physically demonstrate skills to the robotic device. The training or teaching can be performed in a mass-production setting, such as, for example, a manufacturing environment, in which the robotic device can be taught using an aggregate model representing a generic performance of a skill. Alternatively or additionally, the teaching can occur onsite after the robotic device has been deployed (e.g., at a hospital), such that the robotic device can learn to perform the skill in the specific site environment. In some embodiments, a robotic device can be taught in an onsite setting, and can then send information associated with the learned skill to one or more additional robotic devices, such that those additional robotic devices can also have the knowledge of the taught skill when operating in the same onsite setting. Such embodiments can be useful when multiple robotic devices are being deployed at a single site. Each robotic device can then receive and send information to other robotic devices such that they can collectively learn a set of skills for that onsite environment.

In the learning mode, depicted in FIG. 10, method 700 proceeds to 720-724, where a user can use a LfD teaching process to teach the robotic device a skill. In an embodiment, a skill can be defined as gripping an object located at a particular location, picking up the object, moving the object to a different location, and setting the object down in the different location. In another embodiment, a skill can involve interactions with an environment surrounding the robotic device, e.g., interacting with a door.

At 720, a user (or another robotic device) can guide the robotic device, including a manipulating element (e.g., manipulating element 250, 350, or 450) and/or a transport element (e.g., transport element(s) 260, 460), through a movement. For example, a user can guide the manipulating element of the robotic device (and/or other component of the robotic device, e.g., a transport element) through a demonstration associated with executing a particular skill, e.g., an interaction with a human, an engagement with and/or manipulation of an object, and/or other interactions with human(s) and/or a surrounding environment. In an embodiment, a user can demonstrate to the robotic device how to interact with and/or navigate through a door. The user can demonstrate how the robotic device, e.g., via its manipulating element(s), can interact with a handle of the door. For example, a manipulating element of the robotic device, such as an arm, can be guided through a series of motions relative to the door handle (e.g., as placed or identified by a visual input or fiducial marker). While or after the door handle is turned, e.g., using the manipulating element, the robotic device can be guided via demonstration to push the door open, e.g., by moving its transport element(s). The combination of motions of the manipulating element(s) and transport element(s) can be used, as further explained below, to construct a model for executing a skill or behavior for future interactions with the door and/or similar doors.

In some embodiments, the robotic device can be guided by a user that is locally present, e.g., by the user physically moving and/or providing inputs to the robotic device. In some embodiments, the robotic device can be guided by a user located remotely from the robotic device (e.g., a robot supervisor) via, for example, a remote or cloud interface.

While guiding the manipulating element (and/or other component of the robotic device) through the movement, the user can indicate to the robotic device when to capture information about the state of the manipulating element (e.g., joint configurations, joint forces and/or torques, end effector configuration, end effector position), another component of the robotic device, and/or the environment (e.g., a location of an object associated with a selected marker and/or other objects in the environment). For example, the robotic device can receive a signal from a user to capture information about the manipulating element and/or environment at a waypoint or keyframe during the movement of the manipulating element, at 722. In response to receiving a signal, at 724, the robotic device can capture the information about the manipulating element, other component of the robotic device, and/or environment at that keyframe. The manipulating element information can include, for example, joint configurations, joint torques, end effector positions, and/or end effector torques. The environmental information can include, for example, the position of a selected marker relative to the end effector, and can indicate to the robotic device when objects in the environment may have moved. If the movement is still ongoing (728: NO), then the robotic device can wait to capture information about the manipulating element and/or environment at additional keyframes. In some embodiments, the robotic device can be programmed to capture keyframe information without receiving a signal from a user. For example, while the manipulating element is being moved by a user, the robotic device can monitor changes in the segments and joints of the manipulating element, and when those changes exceed a threshold, or when there is a directional change in a trajectory of a segment or joint, the robotic device can autonomously select that point to be a keyframe and record information about the manipulating element and/or environment at that keyframe.
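A minimal sketch of the autonomous keyframe selection described above is shown below, assuming joint configurations are sampled as vectors at a fixed rate. The threshold value and function names are illustrative, not the disclosed method.

    import numpy as np

    # Illustrative sketch: pick keyframes from a stream of joint-configuration samples
    # when the configuration changes by more than a threshold or a joint reverses direction.

    def select_keyframes(joint_samples, change_threshold=0.2):
        samples = np.asarray(joint_samples, dtype=float)
        keyframes = [0]                       # always keep the start of the demonstration
        last_kept = samples[0]
        prev_delta = None
        for i in range(1, len(samples)):
            delta = samples[i] - samples[i - 1]
            moved_enough = np.linalg.norm(samples[i] - last_kept) > change_threshold
            reversed_dir = prev_delta is not None and np.any(np.sign(delta) * np.sign(prev_delta) < 0)
            if moved_enough or reversed_dir:
                keyframes.append(i)
                last_kept = samples[i]
            prev_delta = delta
        keyframes.append(len(samples) - 1)    # always keep the end
        return sorted(set(keyframes))

    demo = [[0.0, 0.0], [0.05, 0.0], [0.3, 0.1], [0.3, 0.4], [0.25, 0.4]]
    print(select_keyframes(demo))  # e.g., [0, 2, 3, 4]

User-signaled keyframes (via button press or speech) could simply be merged into the same keyframe list.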

During the movement of the manipulating element (and/or other component of the robotic device, e.g., a transport element), the robotic device can also continuously or periodically, without receiving a signal from a user, record sensory information, e.g., information about the manipulating element, other element(s) of the robotic device (e.g., a transport element), and/or the environment, at 730. For example, the robotic device can record information about the trajectories of segments and joints, as well as their configurations, as the user moves the manipulating element through the demonstration. During the demonstration, the robotic device can also record information about one or more environmental constraints, e.g., static information about object(s) within the environment (e.g., a location or size of a doorway, a supply bin, etc.) and/or dynamic information about object(s) within the environment (e.g., a level of traffic in a room, movements of users around the robotic device, etc.). In some embodiments, the sensory information recorded by the robotic device during the demonstration can add to and/or modify one or more layers of a map of the environment, e.g., as depicted in FIG. 19.

In some embodiments, the robotic device can include an audio device (e.g., 244) such as, for example, a microphone, and the demarcation of keyframes can be controlled by speech commands. For example, a user can indicate to the robotic device that the user plans to give a demonstration by speaking, “I will guide you.” The demonstration can begin when the user indicates the first keyframe by speaking “start here.” Intermediate keyframes can be indicated by speaking “go here.” And a final keyframe representing the end of the demonstration can be indicated by speaking “end here.” Suitable examples of demonstration teaching are provided in the Akgun article.

In some embodiments, while demonstrating a skill to the robotic device, a user can indicate which portion(s) of the skill are generic and which portion(s) of the skill are more specific to a particular environment or situation, e.g., via one or more inputs into the robotic device. For example, while demonstrating to a robotic device how to move supplies from a first location to a second location (e.g., a room) and to drop those supplies off at the second location, the user can indicate to the robotic device that the action of navigating from the first location to the second location is generic, while the action of dropping off the supplies at the second location is unique and requires more information specific to an environment to implement (e.g., a specific tag or marker to drop off the supplies relative to). When the robotic device later uses a model for that skill to move supplies between different locations, the robotic device can know to request specific information regarding the drop-off and/or scan for that specific information before executing the skill (e.g., request and/or scan for information regarding a specific tag to determine its location before performing the drop-off).

In some embodiments, a user can indicate, after demonstrating a skill to a robotic device, those portions of the skill that are generic or specific, and/or modify those portions of the skill that the user has previously indicated to be generic or specific. In some embodiments, a robotic device, after learning a set of skills, may determine that portions of the set of skills are generic or specific. In some embodiments, the robotic device can recommend to a user that portions of the set of skills are generic or specific, and request confirmation from the user. Based on the user's confirmation, the robotic device can store the information for reference when learning and/or executing future skills associated with the set of skills. Alternatively, the robotic device can automatically determine and label different portions of skills as being generic or specific without user input.

Once the movement or demonstration is complete (728: YES), the robotic device can generate a model for the demonstrated skill based on a subset of all the sensory information (e.g., manipulating element information, transport element information, environmental information) that has been recorded. For example, at 732, the robotic device can optionally prompt a user, e.g., via a user interface or other type of I/O device, for a selection of features that are relevant to learning the skill, and at 734, the robotic device can receive the selection of features from the user. Alternatively or additionally, the robotic device can know to select certain features to use in generating the model based on previous instructions from a user. For example, the robotic device can recognize that it is being shown how to pick up an object, e.g., based on the sensory information, and can automatically select one or more features of the sensory information (e.g., joint configuration, joint torques, end effector torques) to include as relevant features for generating the skill model, e.g., based on past demonstrations of picking up the same or different objects.

At 736, the robotic device can generate the model of the skill using the selected features. The model can be generated using stored machine learning libraries or algorithms (e.g., machine learning libraries 342). In some embodiments, the model can be represented as an HMM algorithm that includes a plurality of parameters, such as, for example, a number of hidden states, a feature space (e.g., features included in a feature vector), and emission distributions for each state modeled as a Gaussian distribution. In some embodiments, the model can be represented as a support vector machine or “SVM” model, which can include parameters such as, for example, a kernel type (e.g., linear, radial, polynomial, sigmoid), a cost parameter or function, weights (e.g., equal, class balanced), a loss type or function (e.g., hinge, square-hinge), and a solving or problem type (e.g., dual, primal). The model can be associated with the relevant sensory information and/or other sensory information recorded by the robotic device during the skill demonstration. The model can also be associated with marker information indicating the set of markers that were manipulated during the skill demonstration and/or features associated with one or more physical objects tied to those markers. The robotic device can store the model in a memory (e.g., storage 230 or 330), at 738.
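As a rough sketch of how such a model might be fit from keyframe features, the example below trains a Gaussian-emission HMM with hmmlearn and, alternatively, an SVM with scikit-learn using the kinds of parameters named above. These libraries and the tiny synthetic data are stand-ins, not the machine learning libraries 342 referenced in the disclosure.

    import numpy as np
    from hmmlearn.hmm import GaussianHMM   # pip install hmmlearn
    from sklearn.svm import SVC            # pip install scikit-learn

    # Synthetic "demonstration" features: rows are keyframes, columns are selected
    # features (e.g., a joint angle and an end-effector force). Values are made up.
    demo_features = np.array([
        [0.00, 0.1], [0.10, 0.1], [0.45, 0.9],
        [0.50, 1.0], [0.55, 1.0], [0.90, 0.2],
    ])

    # Option 1: Gaussian-emission HMM over the demonstration sequence.
    hmm_model = GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
    hmm_model.fit(demo_features)
    print("HMM log-likelihood:", hmm_model.score(demo_features))

    # Option 2: SVM separating "successful grasp" keyframes from others,
    # with kernel, cost, and class-weight parameters as in the text.
    labels = np.array([0, 0, 1, 1, 1, 0])
    svm_model = SVC(kernel="rbf", C=1.0, class_weight="balanced")
    svm_model.fit(demo_features, labels)
    print("SVM prediction:", svm_model.predict([[0.52, 0.95]]))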

In some embodiments, information associated with a demonstration (e.g., sensory information, model, etc.) can be used to add to and/or modify the layers of a map or representation of the environment, e.g., such as the representation 1600 depicted in FIG. 19. For example, FIG. 21 depicts

Optionally, at 740, the robotic device can determine whether the user will be performing another demonstration of the skill. If another demonstration is to be performed (740: YES), then method 700 can return to 720, where a user (or other robotic device) can guide the robotic device through an additional demonstration. If the demonstrations are complete (740: NO), then method 700 can optionally return to the beginning and perform a new scan of the environment. Alternatively, in some embodiments, method 700 can terminate.

In another embodiment, a skill can be a navigation behavior, such as navigating between two locations or navigating around and/or through an object in the environment. Similar to learning a skill with a manipulating element, as described herein, a user (or other robotic device) can guide the robotic device, including a set of transport elements (e.g., transport element(s) 260, 460), through the navigation behavior, at 720. For example, a user can use a joystick or other control device to control the movement of the set of transport elements (and/or other components of the robotic device, e.g., a manipulating element) such that the robotic device can perform the navigation behavior. While controlling the movement of the set of transport elements (and/or other components of the robotic device), the user can signal to the robotic device when to capture sensory information, such as information about the state of the set of transport elements (e.g., angles of each transport element, configurations of each transport element, spacing between transport elements if movable relative to one another, etc.), other components of the robotic device, and/or the environment (e.g., location of the robotic device in a map, location and/or boundary of an object in the environment). The user can signal to the robotic device at keyframes during the movement of the set of transport elements (and/or other components of the robotic device), e.g., at a starting point, an ending point, and/or transition points between moving the set of transport elements in a first direction and a second direction. In response to receiving a signal from the user, at 722, the robotic device can capture a snapshot of the movement (including information about the set of transport elements, other components of the robotic device, and/or the environment) at the keyframe. Alternatively or additionally, the robotic device can be configured to autonomously select points during the movement of the set of transport elements (and/or other components of the robotic device) at which to capture a snapshot of the movement. For example, the robotic device can monitor the angle and/or configuration of the set of transport elements and capture a snapshot of that information whenever the robotic device detects a change in the angle and/or configuration. In some embodiments, the robotic device can be configured to continuously collect information on the set of transport elements, other components of the robotic device, and/or the environment while the user controls the movement of the robotic device.

Similar to learning a skill with a manipulating element, as described herein, the robotic device can continue to capture sensory information associated with the movement of the transport elements (and/or other components of the robotic device) until the movement is completed (728: YES). The robotic device can optionally receive a selection of features that are relevant to learning the navigation behavior and/or autonomously identify relevant features in the collected sensory information (e.g., by recognizing that it is being demonstrated a particular navigation skill and identifying those features that are relevant to learning that skill), at 732-734. The robotic device can then generate a model for the navigation behavior based on the sensory information associated with the relevant features, at 736, and store that model and the sensory information such that they can be used to generate a trajectory for the robotic device to execute at a later point in time, at 738.

An example of a navigation behavior is navigating through a doorway. The robotic device can be configured to navigate through standard doorways within a building, but the robotic device may not be capable of navigating through a non-standard doorway, e.g., a small or oddly shaped doorway. Accordingly, the robotic device may prompt a user to demonstrate to the robotic device how it can safely navigate through the doorway. The robotic device can navigate to one side of the doorway and then have a user demonstrate to the robotic device how to pass through the doorway to the other side. During the demonstration, the robotic device can passively record information such as a location of the robotic device in a map and/or sensor data (e.g., a boundary of the door, a configuration of the transport elements and/or manipulating elements, and other sensory information as described herein). In some embodiments, the robotic device can learn how to navigate through the doorway using an interactive learning template, as further discussed herein with reference to FIG. 16.

In some embodiments, the user can be onsite near the robotic device. In other embodiments, the user can be located at a remote location and can control the robotic device from a compute device (e.g., server 120, compute device 150) via a network connection to the robotic device (e.g., such as depicted in FIG. 1 and described above).

In some embodiments, as described herein, the robotic device can actively scan its surrounding environment to monitor changes in the environment. Accordingly, during learning and/or execution, the robotic device can engage in continuous scanning of the environment, and update the representation of the environment accordingly, as well as the environmental information that it has stored.

In some embodiments, the robotic device can be configured to learn socially appropriate behaviors, i.e., actions that account for interactions with humans. For example, the robotic device can be configured to learn a manipulation or navigation skill that is performed around humans. The robotic device can learn the socially appropriate behavior via a demonstration by a human operator. In some embodiments, the robotic device can be configured to learn the behavior in an interactive setting, e.g., via a human operator intervening in the autonomous execution of a skill by the robotic device and then demonstrating to the robotic device how to execute the skill. The robotic device, upon detecting an intervention by the human operator, can be configured to switch to a learning mode in which the robotic device is controlled by the human operator and passively records information associated with the demonstration and the perceptual context in which the demonstration is performed. The robotic device can use this information to generate a modified model of the skill and to associate that model with the appropriate social context in which the robotic device should execute that skill at a later point in time. As an example, a robotic device may detect that a human operator has intervened in its autonomous execution of a navigation plan and, in response to detecting the intervention, switch to a learning mode of being controlled by the human operator. The robotic device can record information about its surrounding environment and its own movements as the human operator demonstrates how the navigation plan should be modified in the presence of humans, e.g., that the robotic device should move aside in a hallway to let a human pass instead of waiting for the hallway to clear. Further details regarding interactive learning are described herein with reference to FIG. 16.

In the execution mode, depicted in FIG. 11, the robotic device can optionally prompt a user, e.g., via a user interface or other type of I/O device, to select a model, such as a model that was previously generated by the robotic device in the learning mode, at 750. The robotic device can receive the model selection, at 752. In some embodiments, the robotic device can receive the model selection from a user, or alternatively, the robotic device can automatically select a model based on certain rules and/or conditions. For example, the robotic device can be programmed to select a model when it is in a certain area of a building (e.g., when it is in a certain room or on a certain floor), during certain times of day, etc. Alternatively or additionally, the robotic device can know to select a certain model based on the selected set of markers. At 754, the robotic device can determine whether to move closer to a selected marker prior to generating a trajectory and executing a skill with respect to the selected marker. For example, the robotic device can determine, based on the selected set of markers and the selected model, whether it should move to be better positioned to execute the skill (e.g., to be closer or more proximate to the marker, to be facing the marker from a certain angle). The robotic device may make this determination based on the sensory information that was recorded during a demonstration of the skill. For example, the robotic device may recognize that it was positioned closer to the marker when it was demonstrated the skill and accordingly adjust its position.

If the robotic device determines to move with respect to the selected marker (754: YES), the robotic device can move its position (e.g., adjust its location and/or orientation), at 756, and method 700 can return to 702, where the robotic device scans the environment again to obtain a representation of the environment. Method 700 can proceed back through the various steps to 754. If the robotic device determines not to move with respect to the selected marker (754: NO), then the robotic device can generate an action trajectory, e.g., for a manipulating element of the robotic device.

Specifically, at 758, the robotic device can compute a function that transforms (e.g., translates) between a selected set of markers and a set of markers associated with the selected model (e.g., the marker(s) that were selected when the robotic device learned the skill, i.e., generated the selected model), referred to herein as a “stored marker” or “stored set of markers.” For example, the robotic device can be taught a skill using a first set of markers that were at specific location(s) and/or orientation(s) relative to a portion of the robotic device, such as, for example, the end effector, and later, the robotic device can be executing the skill with a second set of markers that are at different location(s) and/or orientation(s) relative to the manipulating element. In such instances, the robotic device can compute a transformation function that transforms between the position(s) and/or orientation(s) of the first set of markers and the second set of markers.
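One common way to obtain such a transformation between corresponding marker positions is a least-squares rigid alignment (the Kabsch/SVD approach). The sketch below illustrates that approach under the assumption that the stored and currently observed markers correspond one to one; it is offered as an illustration, not necessarily the transformation used by the disclosed system.

    import numpy as np

    # Illustrative least-squares rigid alignment between stored marker positions
    # (from the demonstration) and currently observed marker positions.

    def fit_rigid_transform(stored_pts, observed_pts):
        """Return rotation R and translation t such that R @ stored + t ~= observed."""
        stored = np.asarray(stored_pts, dtype=float)
        observed = np.asarray(observed_pts, dtype=float)
        stored_c = stored - stored.mean(axis=0)
        observed_c = observed - observed.mean(axis=0)
        U, _, Vt = np.linalg.svd(stored_c.T @ observed_c)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:              # avoid reflections
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        t = observed.mean(axis=0) - R @ stored.mean(axis=0)
        return R, t

    stored = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
    observed = [[2, 1, 0], [2, 2, 0], [1, 1, 0]]   # same triangle, rotated 90 degrees and shifted
    R, t = fit_rigid_transform(stored, observed)
    print(np.round(R @ np.array([1, 0, 0]) + t, 3))  # approximately [2, 2, 0]

The same R and t can then be applied to the end effector poses recorded at each keyframe, as described next.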

At 760, the robotic device can use the computed transformation function to transform the position and orientation of a portion of the manipulating element, e.g., the end effector of the manipulating element, in each keyframe that was recorded when the skill was taught. Optionally, at 762, the robotic device can account for any environmental constraints, e.g., a feature of an object and/or an area within the environment, such as a size, configuration, and/or location. The robotic device can limit the movement of the manipulating element based on the environmental constraints. For example, if the robotic device recognizes that it will be executing the skill in a supply room, the robotic device can take into account a size of the supply room when transforming the position and orientation of the manipulating element to avoid having a portion of the manipulating element come into contact with a wall or other physical structure within the supply room. The robotic device can also take into account other environmental constraints associated with the supply room, such as, for example, a size of a bin within the supply room, a location of a shelf within the supply room, etc. The robotic device can be provided information regarding environmental constraint(s) and/or taught environmental constraint(s) in advance of executing skills in a setting with the environmental constraint(s), as further described herein with reference to FIG. 17.

Optionally, at 762, the robotic device can use inverse kinematics equations or algorithms to determine the configuration of the joints of the manipulating element for each keyframe. The position and orientation of the end effector and the set of markers can be provided in a task space (e.g., the Cartesian space in which the robotic device is operating), while the orientation of the joints can be provided in a joint or configuration space (e.g., an n-dimensional space associated with the configuration of the manipulating element, where the robotic device is represented as a point and n is the number of degrees of freedom of the manipulating element). In some embodiments, the inverse kinematics calculations can be guided by the joint configuration information recorded when the robotic device was taught the skill (e.g., the configuration of the joints recorded during a teaching demonstration using the manipulating element). For example, the inverse kinematics calculations can be seeded (e.g., provided with an initial guess for the calculation, or biased) with the joint configurations that were recorded at each keyframe. Additional conditions can also be imposed on the inverse kinematics calculations, such as, for example, requiring that the calculated joint configurations not deviate more than a predefined amount from a joint configuration in an adjacent keyframe. At 764, the robotic device can plan the trajectory between the joint configurations from one keyframe to the next keyframe, e.g., in the joint space, to generate a complete trajectory for the manipulating element to execute the skill. Optionally, the robotic device can take into account environmental constraints, as discussed herein.
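To make the seeding idea concrete, the sketch below solves inverse kinematics for a hypothetical 2-link planar arm by numerical optimization, starting from the joint angles recorded at the corresponding demonstration keyframe and softly penalizing deviation from the neighboring keyframe's configuration. The link lengths, weights, and solver are illustrative assumptions, not the solver used by the disclosed robotic device.

    import numpy as np
    from scipy.optimize import minimize

    L1, L2 = 0.5, 0.4  # hypothetical link lengths (meters) of a 2-link planar arm

    def forward_kinematics(q):
        """End-effector (x, y) for joint angles q = [q1, q2]."""
        x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
        y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
        return np.array([x, y])

    def solve_ik(target_xy, seed_q, neighbor_q, deviation_weight=0.1):
        """Find joint angles reaching target_xy, seeded with the demonstrated configuration
        and softly constrained to stay near the adjacent keyframe's configuration."""
        def cost(q):
            task_error = np.sum((forward_kinematics(q) - target_xy) ** 2)
            deviation = np.sum((q - neighbor_q) ** 2)
            return task_error + deviation_weight * deviation
        result = minimize(cost, x0=np.asarray(seed_q), method="BFGS")
        return result.x

    demonstrated_q = np.array([0.6, 0.8])                 # joints recorded at this keyframe
    transformed_target = forward_kinematics([0.7, 0.7])   # keyframe pose after the marker transform
    q_solution = solve_ik(transformed_target, seed_q=demonstrated_q, neighbor_q=demonstrated_q)
    print(np.round(q_solution, 3))

Seeding with the demonstrated configuration biases the solver toward joint solutions that resemble the teaching demonstration, which helps avoid large, abrupt configuration changes between adjacent keyframes.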

In some embodiments, after the robotic device transforms the position and orientation of the portion of the manipulating element (e.g., the end effector), the robotic device can plan the trajectory for the manipulating element in the task space. In such embodiments, method 700 can proceed from 760 directly to 764.

At 766 and 768, the robotic device can optionally present the trajectory to a user and prompt the user, e.g., via a user interface or other I/O device, to accept or reject the trajectory. Alternatively, the robotic device can accept or reject the trajectory based on internal rules and/or conditions, and by analyzing relevant sensory information. If the trajectory is rejected (768: NO), then the robotic device can optionally modify one or more parameters of the selected model, at 770, and generate a second trajectory, at 758-764. The parameters of the model can be modified, for example, by selecting different features (e.g., different sensory information) to include in the model generation. In some embodiments, where the model is an HMM model, the robotic device can change the parameters of the model based on a determined success or failure, in which case the robotic device tracks a log likelihood of different models having different parameters and selects the model with a higher log likelihood than the other models. In some embodiments, where the model is an SVM model, the robotic device can change the parameters by changing the feature space or configuration parameters (e.g., kernel type, cost parameter or function, weights), as described herein.
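The log-likelihood comparison mentioned above could look roughly like the following, where several candidate HMMs with different numbers of hidden states are fit to the recorded demonstration features and the highest-scoring one is retained. The candidate parameter values and the synthetic data are illustrative assumptions.

    import numpy as np
    from hmmlearn.hmm import GaussianHMM  # pip install hmmlearn

    def select_best_hmm(features, candidate_n_states=(2, 3)):
        """Fit one HMM per candidate parameter setting and keep the highest log-likelihood model."""
        best_model, best_loglik = None, -np.inf
        for n in candidate_n_states:
            model = GaussianHMM(n_components=n, covariance_type="diag", n_iter=50, random_state=0)
            model.fit(features)
            loglik = model.score(features)
            if loglik > best_loglik:
                best_model, best_loglik = model, loglik
        return best_model, best_loglik

    # Tiny synthetic feature sequence standing in for recorded keyframe data.
    features = np.array([
        [0.0, 0.1], [0.1, 0.1], [0.2, 0.2], [0.5, 0.9],
        [0.55, 1.0], [0.7, 0.8], [0.9, 0.2], [1.0, 0.1],
    ])
    model, loglik = select_best_hmm(features)
    print(model.n_components, round(loglik, 2))

For SVM models, an analogous loop could sweep kernel type, cost parameter, and class weights and keep the best-performing configuration.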

If the trajectory is accepted (768: YES), then the robotic device can move the manipulating element to execute the generated trajectory, at 772. While the manipulating element is executing the planned trajectory, the robotic device, e.g., via one or more sensors on the manipulating element and other components of the robotic device, can record and/or store sensory information, such as, for example, information about the manipulating element and/or environment, at 774.

Optionally, at 776, the robotic device can determine whether the execution of the skill was successful (e.g., whether an interaction with an object meets predefined success criteria). For example, the robotic device can scan the environment to determine the current state of the environment and the robotic device, including, for example, the locations of one or more objects and/or the position or orientation of those objects relative to the manipulating element or another component of the robotic device, and determine whether that current state aligns with predefined success criteria. The predefined and/or learned success criteria can be provided by a user or, in some embodiments, provided by a different robotic device and/or compute device. The predefined and/or learned success criteria can indicate information about different features of the environment and/or the robotic device that are associated with success. In some embodiments, a user may also provide an input indicating to the robotic device that the execution was successful.

In a particular example, where a skill is defined as gripping an object at a particular location and picking up the object, success for the skill can be taught and/or defined as detecting that one or more markers associated with the object are in a specific relationship with each other and/or the robotic device, or detecting that a sufficient force or torque is being experienced (or was experienced) by the end effector or a joint of the manipulating element (e.g., a wrist joint), signifying that the manipulating element is supporting the weight of the object and therefore has picked up the object. If the execution was not successful (776: NO), then the robotic device may optionally modify the parameters of the model, at 770, and/or generate a new trajectory, at 758-764. If the execution was successful (776: YES), then the data associated with the successful interaction (e.g., data indicating that the execution was successful and how it was successful) can be recorded, and method 700 can optionally return to the beginning and perform a new scan of the environment. Alternatively, in some embodiments, method 700 can terminate.
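The force/torque success check described in this example might be sketched as follows, where the measured vertical force at a wrist joint is compared against the expected weight of the object. The sensor reading format, thresholds, and object mass are assumptions for illustration.

    # Illustrative success check: infer that a grasp-and-lift succeeded if the wrist
    # force/torque sensor reports a sustained vertical load close to the object's weight.

    GRAVITY = 9.81  # m/s^2

    def lift_successful(wrist_fz_samples, object_mass_kg, tolerance=0.3, min_fraction=0.8):
        """wrist_fz_samples: vertical force readings (N) taken after the lift motion."""
        expected = object_mass_kg * GRAVITY
        within = [abs(f - expected) <= tolerance * expected for f in wrist_fz_samples]
        return sum(within) / len(within) >= min_fraction

    readings = [4.7, 4.9, 5.0, 4.8, 4.6]                   # newtons, hypothetical samples
    print(lift_successful(readings, object_mass_kg=0.5))   # expected weight ~4.9 N -> True

Marker-based success criteria (e.g., the object's marker remaining at a fixed offset from the end effector) could be checked in a similar thresholded fashion and combined with the force test.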

In some embodiments, a robotic device is configured to execute a skill involving operation of one or more of a manipulating element, a transport element, and/or other components of the robotic device (e.g., a head, eyes, a sensor, etc.). The robotic device can be configured to plan a trajectory for the skill, similar to that described above with respect to a manipulating element. For example, at 750-752, the robotic device can prompt for and receive a user selection of a model for the skill, or alternatively, autonomously select a model for the skill. At 758, the robotic device can compute a function that transforms between a set of markers currently identified in the environment and a set of stored markers associated with the selected model (e.g., the marker(s) that were identified and stored with the model for the skill when the robotic device learned the skill). At 760, the robotic device can use the computed function to transform the configuration of the one or more components of the robotic device at each keyframe. At 764, the robotic device can plan a trajectory for the one or more components of the robotic device between each transformed keyframe. While transforming the keyframes and/or planning the trajectory, the robotic device can optionally take into account any environmental constraints associated with the setting in which the skill is being executed. At 772, the robotic device can implement movements of the one or more components of the robotic device according to the planned trajectory. In some embodiments, the robotic device can determine a joint configuration, at 762, present the planned trajectory to a user, at 766, and/or perform other optional steps as illustrated in FIG. 11.

FIG. 12 is a block diagram showing a system architecture for robotic learning and execution, including actions performed by a user, according to some embodiments. A system 800 can be configured for robotic learning and execution. System 800 can include one or more robotic devices, such as any of the robotic devices described herein, and can execute modules, processes, and/or functions, depicted in FIG. 12 as active scanning 822, marker identification 824, learning and model generation 826, trajectory generation and execution 828, and success monitoring 829. Active scanning 822, marker identification 824, learning and model generation 826, trajectory generation and execution 828, and success monitoring 829 can correspond to one or more steps performed by a robotic device, as described in reference to method 700, depicted in FIGS. 8-11. For example, active scanning 822 can include step 702 of method 700; marker identification 824 can include one or more of steps 704-710 of method 700; learning and model generation 826 can include one or more of steps 712-738; trajectory generation and execution 828 can include one or more of steps 712, 714, 718, and 750-774; and success monitoring 829 can include one or more of steps 774 and 776.

System 800 can be connected to (e.g., in communication with) one or more devices, including, for example, camera(s) 872, an arm 850 (including a gripper 856 and sensor(s) 870), a display device 842, and a microphone 844. System 800, via display device 842, microphone 844, and/or other I/O device (not depicted), can receive inputs from a user associated with one or more user actions 880. User actions 880 can include, for example: 882, a user accepting markers or requesting a rescan of an environment; 884, a user selecting marker(s); 886, a user selecting relevant information to generate a model; 888, a user selecting a model for executing a skill; 890, a user accepting a trajectory for executing a skill; 892, a user confirming success of an executed skill; and 894, a user teaching a skill via kinesthetic learning.

For active scanning 822, system 800 can use camera(s) 872 to scan an environment and record sensory information about the environment, including information associated with one or more markers in the environment. For marker identification 824, system 800 can analyze the sensory information to identify one or more markers in the environment, and receive input(s) from a user, e.g., via display device 842, indicating 882, the user accepting the markers or requesting a rescan of the environment. For learning and model generation 826, system 800 can receive sensory information collected by camera(s) 872 and/or sensor(s) 870 on arm 850, and use that information to generate a model for a skill. As part of learning and model generation 826, system 800 can receive input(s) from a user, e.g., via display device 842 and/or microphone 844, indicating 884, that the user has selected a set of marker(s) for teaching the skill, 886, that a user has selected certain features of recorded sensory information to use in generating the model, and/or 894, that the user is demonstrating the skill. For trajectory generation and execution 828, system 800 can generate a planned trajectory and control movements of arm 850 to execute the trajectory. As part of trajectory generation and execution 828, system 800 can receive input(s) from a user, e.g., via display device 842, indicating 888, that the user has selected a model for generating a trajectory and/or 890, that the user has accepted or rejected the generated trajectory. For success monitoring 829, system 800 can determine whether the execution of a skill was successful by analyzing sensory information recorded by sensor(s) 870 during execution of the skill. As part of success monitoring 829, system 800 can receive input(s) from a user, e.g., via display device 842 and/or microphone 844, indicating 892, that the user has confirmed that the execution was successful.

While specific device(s) and/or connections between system 800 and those device(s) are depicted in FIG. 12, it is understood that additional device(s) (not depicted) can communicate with system 800 to receive information from and/or send information to system 800, according to any of the embodiments described herein.

FIGS. 13-17 are flow diagrams illustrating methods 1300 and 1400 that can be performed by a robotic system (e.g., robotic system 100) including one or more robotic devices, according to embodiments described herein. For example, methods 1300 and/or 1400 can be performed by a single robotic device and/or multiple robotic devices.

As depicted in FIG. 13, a robotic device is configured to operate in an execution mode, at 1301. In the execution mode, the robotic device can autonomously plan and execute actions within an environment. To determine which actions to execute and/or plan how to execute an action, the robotic device can scan the environment and collect information on the environment and/or objects within the environment, at 1304, and use that information to build and/or change a representation or map of the environment (e.g., representation 1600), at 1305. The robotic device can repeatedly (e.g., at predefined times and/or time intervals) or continuously scan the environment and update its representation of the environment based on the information that it collects on the environment. Similar to methods described above, the robotic device can collect information on the environment using one or more sensors (e.g., sensor(s) 270 or 470, and/or image capture device(s) 472).
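A condensed sketch of this scan-and-update cycle (steps 1304-1305) is given below. The `EnvironmentMap` class, the sensor callable, and the fixed scan period are placeholder assumptions rather than the disclosed system.

```python
# Hypothetical execution-mode loop: scan at a fixed period and fold each
# observation set into a simple map keyed by object identifier.
import time


class EnvironmentMap:
    def __init__(self):
        self.objects = {}                       # object id -> latest observation

    def update(self, observations):
        for obj_id, obs in observations.items():
            self.objects[obj_id] = obs          # add new entries or overwrite stale ones


def execution_scan_loop(sensor, env_map, period_s=1.0, max_iterations=5):
    """Scan at a fixed period and update the map with whatever the sensor reports."""
    for _ in range(max_iterations):
        observations = sensor()                 # e.g., {"door_3": {"state": "open"}}
        env_map.update(observations)
        time.sleep(period_s)
    return env_map


if __name__ == "__main__":
    fake_scans = iter([{"door_3": {"state": "open"}}, {"cart_7": {"x": 1.2, "y": 0.4}}])
    env_map = execution_scan_loop(lambda: next(fake_scans, {}), EnvironmentMap(),
                                  period_s=0.0, max_iterations=2)
    print(env_map.objects)
```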

In some embodiments, the robotic device can repeatedly and/or continuously scan the environment, at 1304, for data tracking and/or analytics, at 1307. For example, the robotic device can move through an environment (e.g., a building such as a hospital), autonomously or remotely driven (e.g., by a robot supervisor), and collect data regarding the environment, objects within the environment, etc. This information can be used, for example, by a data tracking & analytics element (e.g., data tracking & analytics element 1606) that manages supplies and/or equipment for a hospital. In some embodiments, the robotic device can collect information on various humans (e.g., patients) to track the behavior of such patients (e.g., compliance with medications and/or treatments), run diagnostic tests, and/or conduct screening, etc.

In some embodiments, the robotic device can repeatedly and/or continuously scan the environment, at 1304, for modelling and learning purposes, at 1309. For example, the robotic device can collect data regarding its environment and/or objects within that environment (e.g., including humans and their behavior in response to robot actions), and use that information to develop new behavior and/or actions. In some embodiments, the robotic device can collect large amounts of information about an environment, which can be used by the robotic device to further refine and/or generate models specific to that environment. In some embodiments, the robotic device can provide this information to a robot supervisor (e.g., a remote user) that can use the information to further adapt the robotic device to a specific environment, e.g., by generating and/or modifying skill models and/or behaviors within that environment. The robot supervisor can repeatedly (e.g., at specific intervals of time) and/or continuously (e.g., in real time) tweak the information that is collected by the robotic device and/or other robotic devices and/or the parameters of the models that the robotic devices are using. In some embodiments, the robot supervisor, via this active exchange of information with the robotic device, can repeatedly and/or continuously adapt the robotic device for a particular environment. For example, the robot supervisor can modify planned paths that a robotic device may be using to navigate through an environment, which can in turn change the information and/or model(s) used by the robotic device to generate motion for its transport element. These changes, both provided by the robot supervisor and/or made by the robotic device, can feed into one or more layers of a map (e.g., map 1600) of the environment, such as, for example, a social or semantic layer of the map.

At 1306, the robotic device can determine which action(s) to execute within the environment, i.e., perform arbitration on a set of resources or components capable of executing certain actions. The robotic device can select between different actions based on one or more arbitration algorithms (e.g., arbitration algorithm(s) 1758). The robotic device can perform arbitration autonomously and/or with user input. For example, the robotic device can be configured to request user input when the robotic device cannot determine a current state of one or more of its components and/or an object within the environment, or when the robotic device cannot determine which action to execute and/or plan how to execute a selected skill. In some embodiments, the robotic device can be configured to autonomously select among a set of actions when the robotic device is familiar with a particular setting (e.g., has previously learned and/or executed actions in the setting) and to request user input when the robotic device encounters a new setting. When the robotic device encounters the new setting, the robotic device can request that a user select the appropriate action for the robotic device to execute or, alternatively, select an action and prompt a user to confirm the selection of the action.
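As one simplified, hypothetical form of the arbitration at 1306 (the actual arbitration algorithm(s) are not limited to this), candidate actions can be scored, the best one selected autonomously in a familiar setting, and a user prompt requested when the setting is new or the scores are inconclusive. The scoring scheme and margin threshold below are assumptions.

```python
# Toy arbitration: pick the highest-scoring action; ask for user input when
# the setting is unfamiliar or the top two scores are too close to call.
def arbitrate(candidate_actions, setting_is_familiar, score_fn, min_margin=0.1):
    """Return (action, needs_user_input)."""
    if not candidate_actions:
        return None, True
    scored = sorted(candidate_actions, key=score_fn, reverse=True)
    if not setting_is_familiar:
        return scored[0], True                  # propose best action, ask user to confirm
    if len(scored) > 1 and score_fn(scored[0]) - score_fn(scored[1]) < min_margin:
        return scored[0], True                  # ambiguous: request user input
    return scored[0], False                     # confident autonomous selection


if __name__ == "__main__":
    priorities = {"deliver_supplies": 0.8, "recharge": 0.3}
    action, ask_user = arbitrate(list(priorities), setting_is_familiar=True,
                                 score_fn=priorities.get)
    print(action, ask_user)
```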

In some embodiments, a human operator (e.g., a robot supervisor) can also monitor the robotic device and send a signal to the robotic device when the human operator wants to intervene in the execution of an action by the robotic device. The human operator can be located near the robotic device and/or at a remote compute device that is connected via a network to the robotic device or a nearby device that can be used to monitor the robotic device. The human operator may decide to intervene in the execution of an action by the robotic device, e.g., when the human operator wants to teach the robotic device a new skill or behavior, for safety reasons (e.g., to prevent harm to a human or the environment surrounding a robotic device), and/or to prevent injury to the robotic device.

In response to determining that user input is needed (e.g., because of an unfamiliar setting or in response to a signal from a user) (1312: YES), the robotic device can optionally prompt a user to provide a user input, at 1314. For example, the robotic device may cause an onboard display or a display located at a remote device (e.g., via a remote or cloud interface) to display a prompt to the user requesting the user input. The robotic device can receive a user input, at 1315, and perform arbitration based on that user input. When user input is not needed (1312: NO), the robotic device continues to autonomously perform arbitration.

After the robotic device has selected an action to execute, the robotic device can plan and execute the action, at 1308. As discussed above, the action can be associated with a task, such as a manipulation action (e.g., involving a manipulating element such as those described herein) or a movement (e.g., involving a transport element such as those described herein), and/or a social behavior. While executing the action, the robotic device can continue to scan its surrounding environment, at 1304. When the robotic device detects a change in its current state and/or a state of the environment (e.g., location of objects within the environment), the robotic device can evaluate the change to determine whether it should interrupt execution of the action, at 1310. For example, the robotic device can determine to interrupt execution of the action in response to detecting a physical engagement between one or more of its components and a human or other object in the environment (e.g., when a human operator touches a manipulating element), or when detecting the presence of a human or other object nearby. Additionally or alternatively, the robotic device can determine to interrupt execution of the action in response to receiving a signal from a user, e.g., a robot supervisor. In some embodiments, the robotic device can be configured in advance to interrupt execution of an action at specific points during the execution of an action, e.g., when the robotic device may need a user to demonstrate a part of the action, as defined in an interactive learning template. Further details regarding learning using interactive learning templates are provided herein, with reference to FIG. 15.

If the robotic device determines to interrupt execution of the action (1310: YES), then the robotic device may determine whether user input is needed, at 1312. As discussed above, if the robotic device determines that user input is required (1312: YES), then the robotic device may optionally prompt for a user input and/or receive a user input, at 1314-1315. The robotic device can then return to scanning its surrounding environment, at 1304, performing arbitration on a set of resources, at 1306, and/or executing an action, at 1308. Optionally, as depicted in FIGS. 14 and 15, when the robotic device determines that user input is required, the robotic device can switch to a learning mode, at 1402, and then proceed to learn a skill, at 1404, or to learn an environmental constraint, at 1406. If the robotic device determines that user input is not needed (1312: NO), then the robotic device can return to scanning its surrounding environment, at 1304, performing arbitration on a set of resources, at 1306, and/or executing an action, at 1308. If the robotic device determines that the action does not need to be interrupted (1310: NO), then the robotic device can continue to execute the action, at 1308.

In some embodiments, the user input received by the robotic device, e.g., at 1315, can include feedback from a user regarding whether the selection and/or execution of a skill was appropriate and/or successful. For example, a user can tag an action (e.g., a task or behavior) being executed by the robotic device (or having previously been executed by the robotic device) as a positive or negative example of the action. The robotic device can store this feedback (e.g., as success criteria associated with the action and/or other actions) and use it to adjust its selection and/or execution of that action or other actions in the future.

In some embodiments, the robotic device is configured to engage in socially appropriate behavior. The robotic device may be designed to operate in an environment with humans. When operating around humans, the robotic device can be configured to continually plan for how its actions may be perceived by a human, at 1306-1308. For example, when the robotic device is moving, the robotic device can monitor its surrounding environment for humans, at 1304, and engage in social interactions with any humans that it encounters, at 1308. When the robotic device is not executing any task (e.g., is stationary), the robotic device can continue to monitor its surrounding environment for humans, at 1304, and determine whether it may need to execute one or more socially appropriate behaviors based on how a human may perceive its presence, at 1306. In such embodiments, the robotic device may be configured with an underlying framework (e.g., one or more arbitration algorithms) that plans for and executes behavior that is socially appropriate given any particular context or setting. The robotic device can be configured to manage multiple resources (e.g., a manipulating element, a transport element, a humanoid component such as a head or eyes, a sound generator, etc.) and to generate appropriate behavior based on one or more of these resources.

FIG. 18 provides an example of components of a robotic device performing arbitration on an attention mechanism (e.g., an eye element). For example, a robotic device may have a camera or other element that can be perceived by a human as having an eye gaze (e.g., an eye element). Humans near the robotic device may perceive the direction that the eye element is directed toward as being what the robotic device is paying attention to. Accordingly, the robotic device can be configured to determine where the eye element is pointed when the robotic device is operating and/or near a human. The robotic device can continually arbitrate the use of the eye element such that the robotic device maintains socially appropriate behavior when operating around and/or being in the presence of humans.

As depicted in FIG. 18, different components of the robotic device can request eye gaze targets (e.g., to have the eye element directed in a particular direction). A first component (e.g., a camera, laser, or other sensor) associated with active scanning can determine that an object is in the field of view of the eye element, at 1510. The first component can further determine that the object is a human, at 1514. In response to determining that the object is a human, the first component can send a request 1516 to a central resource manager 1508 (e.g., a control unit (e.g., control unit(s) 202, 302, and/or 1702) and/or a component of the control unit) to have the eye element directed toward a face of the human. A second component (e.g., a camera, laser, or other sensor) associated with a navigation action can determine that a particular location (e.g., an intermediate or final destination of the navigation) is within sight of the eye element, at 1520. In response to determining that the location is in sight, the second component can send a request 1522 to the central resource manager 1508 to have the eye element directed toward the location. A third component (e.g., a sensor on a manipulation element, a camera, or other sensor) associated with a manipulation action can determine that the robotic device is engaged with an object (e.g., the robotic device is carrying an object, a human has touched the robotic device), at 1530. In response to determining that the robotic device is engaged with the object, the third component can send a request 1532 to the central resource manager 1508 to have the eye element directed toward the object.

The central resource manager 1508 receives the requests 1516, 1522, and 1532 from the components of the robotic device and can perform arbitration to determine which direction to direct the eye element, at 1540. When determining which direction to direct the eye element, the central resource manager 1508 can take into account any defined rules or constraints, e.g., socially appropriate constraints. These defined rules can include, for example, not moving the eye element twice during a predefined period of time (e.g., about five seconds), having the eye element face a direction for at least a predefined minimum period of time (e.g., about five seconds), having the eye element move after a predefined maximum period of time, which can be different for a non-human object versus a human, etc. The defined rules can be encoded into an arbitration algorithm for managing the eye gaze resource.
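The sketch below illustrates, under stated assumptions, one way such rules might be encoded: a resource manager collects gaze-target requests, prefers human faces over other targets, and enforces a minimum dwell time before the gaze may move again. The priority ordering, dwell value, and request format are illustrative, not the disclosed arbitration algorithm.

```python
# Sketch of eye-gaze arbitration: highest-priority request wins, subject to a
# minimum dwell time before the gaze target is allowed to change.
import time

REQUEST_PRIORITY = {"human_face": 3, "engaged_object": 2, "navigation_target": 1}


class EyeGazeManager:
    def __init__(self, min_dwell_s=5.0):
        self.min_dwell_s = min_dwell_s
        self.current_target = None
        self.last_move_time = float("-inf")

    def arbitrate(self, requests, now=None):
        """Pick the highest-priority request, honoring the minimum dwell time."""
        now = time.monotonic() if now is None else now
        if not requests:
            return self.current_target
        best = max(requests, key=lambda r: REQUEST_PRIORITY.get(r["kind"], 0))
        dwell_elapsed = now - self.last_move_time
        if best["target"] != self.current_target and dwell_elapsed >= self.min_dwell_s:
            self.current_target = best["target"]
            self.last_move_time = now
        return self.current_target


if __name__ == "__main__":
    manager = EyeGazeManager()
    requests = [
        {"kind": "navigation_target", "target": "hallway_end"},
        {"kind": "human_face", "target": "person_12"},
    ]
    print(manager.arbitrate(requests, now=0.0))   # person_12: the human face has priority
    print(manager.arbitrate([{"kind": "navigation_target", "target": "hallway_end"}],
                            now=2.0))             # still person_12: dwell time not yet elapsed
```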

In some embodiments, the central resource manager 1508 can perform arbitration of the eye gaze resource based on social context and other information collected on an environment. For example, the robotic device can be configured to associate certain behavior with particular locations, e.g., waiting for humans to pass through a busy doorway or hallway before attempting to navigate through the doorway or hallway, operating with less sound (e.g., refraining from using sound and/or speech functionality) in a quiet area, etc. The robotic device can be configured to capture social context information associated with such locations and add it into a representation of the environment used for navigation (as described herein with reference to FIG. 19). In some embodiments, a human operator can also provide the robotic device with social context information, which the robotic device can adapt over time as the robotic device learns more about the environment in which it operates.

FIGS. 15-17 illustrate flow diagrams of a robotic device operating in a learning mode. As discussed above, a robotic device can operate in an execution mode and switch to operating in a learning mode, e.g., when the robotic device determines that it requires user input, at 1312. Alternatively or additionally, a robotic device can be set to operate in a learning mode, e.g., when the robotic device is initially deployed in a new environment (e.g., a new area, building, etc.). The robotic device can operate in the learning mode until a user indicates to the robotic device that it can switch to operating in the execution mode and/or the robotic device determines that it can switch to operating in the execution mode.

When operating in the learning mode, the robotic device can learn a skill, at 1404, and/or learn an environmental constraint, at 1406. The robotic device can be configured to learn a skill with or without an existing model of the skill (e.g., a generic model of the skill). When learning a skill without an existing model (e.g., learning a skill without relying on prior knowledge), the robotic device can generate a model for the skill after being guided through a demonstration of the skill, as described herein with respect to FIG. 10. When learning a skill with an existing model (e.g., a generic model of the skill), the robotic device can initiate execution of the skill with the existing model and request user input when the robotic device needs a user to demonstrate a part of the skill. The existing model of the skill can be or form a part of an interactive learning template, i.e., a template for guiding the robotic device to learn a skill with input from a user at specified points during the execution of the skill.

FIG. 16 depicts a process for learning a skill using an interactive learning template. A robotic device may be deployed in a specific setting, e.g., a building. The robotic device may make a selection of an existing model of a skill, at 1410. The existing model of the skill may be a generic model of the skill that is not specialized to the environment or setting in which the robotic device is operating. At 1412, the robotic device can generate a plan for executing the skill using the skill model, according to a process similar to the process discussed herein with reference to FIG. 11. At 1414, the robotic device can initiate the execution of the skill by performing one or more steps or parts of the skill that do not require user input and/or specialization. Upon reaching a part of the skill that requires user input (e.g., a part that the robotic device cannot determine how to execute, or a part of the skill that requires specialization to the setting in which the robotic device is executing the skill) (1416: YES), the robotic device can prompt a user to provide a demonstration of that part of the skill, at 1418. In some embodiments, the interactive learning template may indicate to the robotic device when it should prompt a user for a demonstration. Alternatively or additionally, the robotic device may autonomously determine that it requires user input to be able to execute a part of the skill, e.g., when the robotic device cannot determine the state of an object in the environment and/or generate a plan to execute a particular part of the skill given constraints imposed by the environment. In some embodiments, a user (e.g., a robot supervisor) may also be monitoring the robotic device's execution of the skill and send a signal to the robotic device when the user wants to demonstrate a part of the skill to the robotic device. The robotic device, upon receiving the signal from the user, can then determine to proceed to 1418.

At 1420, a user can guide the robotic device through a movement. For example, the user can move one or more components of the robotic device (e.g., a manipulating element, a transport element, etc.) to demonstrate the part of the skill. While guiding the robotic device through the movement, the user can optionally indicate to the robotic device when to capture information about the state of one or more of its components and/or the environment, at 1422. For example, the robotic device can receive a signal from the user to capture sensory information, including information about the manipulating element, transport element, and/or environment, at a keyframe during the movement of the robotic device. Alternatively, the robotic device can autonomously determine when to capture sensory information. For example, while the robotic device is being moved by a user, the robotic device can monitor changes in one or more of its components, and when those changes exceed a threshold, or when there is a directional change in a trajectory of a component, the robotic device can autonomously select that point to be a keyframe and record information about the robotic device and/or environment at that keyframe. In response to receiving a signal from a user or autonomously determining to capture sensory information, the robotic device can capture sensory information using one or more of its sensors, at 1424. During the movement of the robotic device, the robotic device can also continuously or periodically, without receiving a signal from a user, record sensory information, at 1430.

Once the movement or demonstration is complete (1426: YES), the robotic device can optionally receive a selection of features that are relevant to learning the part of the skill that has been demonstrated, at 1432. In some embodiments, the robotic device can autonomously make a selection of features and/or prompt a user to confirm the selection made by the robotic device, at 1432. At 1436, the robotic device can store the information associated with the demonstration. If the execution of the skill is not complete (1417: NO), then the robotic device continues with its execution of the skill, at 1414, and can prompt the user for additional demonstrations of parts of the skill, as needed, at 1418. The robotic device can continue to cycle through the interactive learning process until the execution of the action is complete (1417: YES), at which point the robotic device can generate a model for the skill with parts that are specialized to the setting in which the robotic device executed the skill, at 1438. The robotic device can then learn another skill and/or environmental constraint. Alternatively, if the robotic device does not need to learn another skill and/or environmental constraint, the robotic device can switch into its execution mode, at 1301, and begin scanning an environment, performing arbitration, and/or executing actions.
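A rough sketch of this interactive-learning-template flow (FIG. 16) follows: the robot steps through the parts of a skill, executes the parts it can plan on its own, and asks the user to demonstrate the parts the template flags as needing specialization. All callbacks, field names, and data shapes are hypothetical placeholders.

```python
# Sketch: walk an interactive learning template, mixing autonomous execution
# with user demonstrations, then assemble a specialized skill model.
def run_interactive_template(template, execute_part, demonstrate_part):
    """template: list of {"name": str, "needs_demo": bool}. Returns a specialized model."""
    specialized_parts = []
    for part in template:
        if part["needs_demo"]:
            keyframes = demonstrate_part(part["name"])   # user guides the robot (1418-1424)
            specialized_parts.append({"name": part["name"], "keyframes": keyframes})
        else:
            execute_part(part["name"])                   # autonomous execution (1414)
            specialized_parts.append({"name": part["name"], "keyframes": None})
    return {"skill": "specialized", "parts": specialized_parts}   # model generation (1438)


if __name__ == "__main__":
    template = [
        {"name": "navigate_to_room", "needs_demo": False},
        {"name": "place_item_on_shelf", "needs_demo": True},
    ]
    model = run_interactive_template(
        template,
        execute_part=lambda name: print(f"executing {name} autonomously"),
        demonstrate_part=lambda name: [{"joint_angles": [0.1, 0.4, -0.2]}],
    )
    print(model)
```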

With interactive learning templates, a robotic device can be taught and/or provided an initial set of models for skills. The initial set of models can be developed before the robotic device is deployed on-site in a specific environment (e.g., a hospital). For example, this initial set of models can be developed in a factory setting or at a training location, and made available to the robotic device. Once deployed on-site, the robotic device can adapt or specialize the initial set of models to the environment, e.g., via an interactive learning session, as described herein. Moreover, as new models are developed (e.g., off-site at a factory or training location), the new models can be made available to the robotic device, e.g., via a network connection. Accordingly, systems and methods described herein enable a user and/or entity to continue to develop new models for skills and provide those to robotic devices, even after those robotic devices have been deployed on-site.

An example of an interactive learning session can involve adapting a generic model for moving an item into a room. The robotic device can be equipped with the generic model and deployed on-site at a hospital. When deployed at the hospital, the robotic device can autonomously initiate execution of the skill to move an item into a patient room, at 1412-1414. For example, the robotic device can autonomously navigate to the patient room using a map of the hospital. Once the robotic device has navigated to the patient room, the robotic device can determine that it needs a user to demonstrate where to drop off the item in the patient room (1416: YES). The robotic device can prompt a user to move it to the specific drop-off location, at 1418. The user can control the movement of the robotic device, e.g., using a joystick or other type of control device, at 1420. As discussed above, the user can be located on-site near the robotic device or located at a remote location. While the user moves the robotic device to the drop-off location, the robotic device can capture sensory information about its current state and/or its surrounding environment, at 1424 and/or 1430. Once at the drop-off location, the robotic device can switch back to autonomous execution, at 1414. For example, the robotic device can execute a known arm motion, e.g., locating the item in a container on-board the robotic device, grabbing the item using a manipulating element, and positioning its manipulating element in a generic position for releasing the item. The robotic device can determine for a second time that it needs a user to demonstrate a part of the skill, e.g., dropping the item on a shelf (1416: YES). The robotic device can prompt the user for another demonstration, at 1418, and the user can move the manipulating element such that the item is in position over the shelf. The robotic device can capture sensory information again during the movement of the manipulating element, at 1424 and/or 1430. The robotic device can regain control once again and autonomously open a gripper of the manipulating element to drop off the item on the shelf and then retract its manipulating element back into a resting position, at 1414. The robotic device can then determine that execution is complete (1417: YES) and generate a specialized model of the skill based on the information it captured during the two user demonstrations, at 1438.

Other examples of interactive learning sessions can include adapting a navigation action, e.g., to navigate through a specific hallway and/or doorway, as discussed above.

In some embodiments, a user (or robotic device) can modify a model for a skill after the skill is demonstrated and/or a model for the skill is generated. For example, a user can iterate through keyframes captured during a demonstration of the skill and determine whether to keep, modify, and/or delete those keyframes. By modifying and/or deleting one or more keyframes, the user can modify a model for the skill generated based on those keyframes. Examples of using iterative and adaptive versions of keyframe-based demonstrations are described in the article entitled "Trajectories and keyframes for kinesthetic teaching: a human-robot interaction perspective," authored by Akgun et al., published in Proceedings of the 7th Annual ACM/IEEE International Conference on Human-Robot Interaction (2012), pp. 391-98, which is incorporated herein by reference. As another example, an interactive graphical user interface ("GUI") can be used to display keyframes captured during a demonstration of a skill to a user such that the user can indicate how much variance associated with a position, orientation, etc. of the components of the robotic device is acceptable during a planned execution. Examples of using a GUI with keyframes are described in the article entitled "An Evaluation of GUI and Kinesthetic Teaching Methods for Constrained-Keyframe Skills," authored by Kurenkov et al., published in IEEE/RSJ International Conference on Intelligent Robots and Systems (2015), accessible at http://sim.ece.utexas.edu/static/papers/kurenkov_iros2015.pdf, which is incorporated herein by reference. These examples enable modification of a model for a skill after the skill is learned. Systems and methods described herein further provide a learning template that enables a robotic device to plan and execute certain parts of the skill while leaving other parts of the skill to user demonstration during an on-going execution of the skill. Moreover, systems and methods described herein provide a robotic device that can determine when to collect data and/or request demonstration of a part of a skill, e.g., autonomously or based on information provided to the robotic device in advance of the learning process.

FIG. 17 depicts a process for learning an environmental constraint. Asnoted above, a robotic device can be configured to learn anenvironmental constraint, which may be applicable to a set of skillsthat are learned and/or executed in a setting including theenvironmental constraint. At 1450, the robotic device can optionallyprompt a user for a selection of a type of constraint and/or an objectassociated with the constraint. The constraint type can be, for example,a barrier (e.g., a wall, a surface, a boundary), a size and/ordimension, a restricted area, a location of an object, etc. At 1452, therobotic device can receive a selection of a constraint type and/orobject. Optionally, at 1454, a user (or other robotic device) can guidethe robotic device through a movement to demonstrate the environmentalconstraint. For example, a user can demonstrate the size of a supply binto a robotic device by moving a manipulating element of the roboticdevice along one or more edges of the bin. As another example, a usercan demonstrate a location of a shelf by moving a manipulating elementof the robotic device along a surface of the shelf. During a movement ofthe robotic device, the robotic device can capture informationassociated with the constraint using one or more sensors (e.g., camera,laser, tactile, etc.), at 1456. At 1458, the robotic device can storethe captured information associated with the environmental constraintfor use with models of any skills executed in a setting including theenvironment constraint. In some embodiments, the robotic device mayinclude the information associated with the environmental constraint ineach model for a skill that is executed in the setting. Alternatively,the robotic device can be configured to reference the informationassociated with the environmental constraint when planning and/orexecuting a skill in the setting. In some embodiments, the environmentalconstraint may be added to a representation of the environment (e.g.,representation 1600).

In some embodiments, the robotic device can learn the environmental constraint without requiring a user demonstration or movement of the robotic device. For example, after receiving a selection of a constraint type and/or object, at 1452, the robotic device can be configured to scan the environment for relevant information associated with the environmental constraint, at 1456. In some embodiments, the robotic device can present the information that it captures and/or determines to be relevant to the environmental constraint to a user such that the user can confirm and/or modify what information is relevant to the environmental constraint. The robotic device can then store the relevant information, at 1458.

In some embodiments, the robotic device can use a transport element (e.g., wheels or tracks) to move about an environment and learn one or more environmental constraints. For example, the robotic device can move through a corridor or hallway, e.g., while undergoing a demonstration and/or executing a skill, and learn that the corridor is busy during certain time frames. In some embodiments, the robotic device can learn and associate various environmental constraints and/or behaviors with different conditions (e.g., time, location, etc.). For example, the robotic device can recognize that a corridor is busy based on information collected by its sensor(s), user input, and/or information derived from sensed information (e.g., during a demonstration and/or execution of a skill). As another example, the robotic device can determine that it should respond with a "hello" or "good night" at different times of day when a user enters a particular room.

In some embodiments, the robotic device can learn environmental constraints autonomously and interactively. For example, the robotic device can acquire an initial set of environmental constraints by moving through an environment (e.g., moving through a corridor or hallway), at 1458. A user (e.g., a robot supervisor or local user) can review the constraint and determine whether to adjust the constraint by directly modifying the constraint and/or via demonstration, at 1459. For example, some constraints can be provided through an interaction with the robotic device (e.g., a demonstration), while other constraints (e.g., a location of an object such as, for example, a distance a shelf extends from a wall) can be provided through an input (e.g., to a user interface such as, for example, user interface 240).

FIG. 21 depicts an example of information that flows from and feeds into a map 2120 (e.g., a map of a building such as, for example, a hospital). The map 2120 can be stored and/or maintained by one or more robotic devices, such as any of the robotic devices as described herein. The map 2120 can be similar to map 1600, as depicted in FIG. 19. For example, the map 2120 can include one or more layer(s) 2122 such as, for example, a navigation layer, a static layer, a dynamic layer, and a social layer.

The map 2120 can provide information that a robotic device can use while operating in a learning mode 2102 or an execution mode 2112. For example, a robotic device operating in the learning mode 2102 can access the map 2120 to obtain information regarding environmental constraint(s), object(s), and/or social context(s) (e.g., similar to state information 331, 1732, object information 340, 1740, environmental constraint(s) 1754, social context 1632, etc.). Such information can enable the robotic device to determine a location of object(s), identify characteristic(s) of object(s), analyze one or more environmental constraint(s), select skill model(s) to use, prompt user(s) for input(s), etc., as described herein. Additionally or alternatively, a robotic device operating in the execution mode 2112 can access the map 2120 to obtain information regarding environmental constraint(s), object(s), and/or social context(s), and use that information to evaluate an environment, arbitrate between different behaviors or skills, determine which skills to execute, and/or adapt behavior or skills to suit a particular environment.

A robotic device operating in the learning mode 2102 can also provide information that adds to and/or changes information within the map 2120. For example, the robotic device can incorporate sensed information 2104 (e.g., information collected by one or more sensor(s) of the robotic device) and/or derived information 2106 (e.g., information derived by the robotic device based on, for example, analyzing sensed information 2104) into the map 2120. The robotic device can incorporate such information by adding the information to the map 2120 and/or adapting existing information in the map 2120. Additionally, the robotic device operating in the execution mode 2112 can provide information (e.g., sensed information 2114, derived information 2116) that adds to and/or changes information within the map 2120.

As an example, a robotic device undergoing a demonstration (e.g., such as that depicted in FIGS. 10 and 16) of how to navigate through a doorway can collect information during the demonstration that feeds into the layer(s) 2122 of the map 2120. The doorway can be a specific doorway that exists at various locations throughout a building, e.g., as represented in the map 2120. Characteristics and/or properties of the doorway can be recorded by various sensors on the robotic device and/or derived by the robotic device based on sensed information. For example, the robotic device can sense a size of the doorway and determine that the doorway is a tight doorway. The robotic device can add this information regarding the doorway to the map 2120, e.g., using one or more semantic labels, as raw or processed sensor information, as a derived rule, etc. The robotic device can also use the information to adapt a contextual layer of the map (e.g., a social or behavioral layer of the map, such as social layer 1630 depicted in FIG. 19). For example, the robotic device can determine to engage in specific behavior(s) and/or action(s) based on the doorway being a tight doorway, including, for example, going through the doorway when the doorway has been opened by a user (or other robotic device(s)) instead of going through the doorway following and/or alongside a user (or other robotic device(s)), seeking a specific user familiar with the doorway to assist the robotic device with navigating through that doorway (e.g., by holding open the doorway and/or guiding the robotic device through various action(s)), etc.

In some embodiments, the map 2120 can be centrally maintained for one or more robotic devices described herein. For example, the map 2120 can be stored on a remote compute device (e.g., a server) and centrally hosted for a group of robotic devices that operate together, e.g., in a hospital. The map 2120 can be updated as information (e.g., sensed information 2104, 2114 and derived information 2106, 2116) is received from the group of robotic devices as those robotic devices operate in learning mode 2102 and/or execution mode 2112, and be provided to each of the robotic devices as needed to execute actions, behavior, etc. As individual robotic devices within the group learn new information such that one or more layer(s) 2122 of the map 2120 are adapted, this information can be shared with the other robotic devices when they encounter a similar environment and/or execute a similar skill (e.g., action, behavior). In some embodiments, a local copy of the map 2120 can be stored on each robotic device, which can be updated or synchronized (e.g., at predetermined intervals, during off hours or downtime) with a centrally maintained copy of the map 2120. By regularly updating the map (e.g., with newly collected information, such as sensed information 2104, 2114 and derived information 2106, 2116) and/or sharing the map between robotic devices, each robotic device can have access to a more comprehensive map that provides it with more accurate information for interacting with and/or executing skills within its surrounding environment.
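A minimal sketch of one possible synchronization policy follows: each robot keeps a local copy, and a periodic merge into the central copy keeps the newest entry per map key. The merge-by-timestamp policy and the record format are illustrative assumptions only; the disclosure is not limited to this scheme.

```python
# Sketch: merge a robot's local map into the centrally maintained map,
# keeping whichever entry carries the newer timestamp.
def merge_maps(central, local):
    """Merge a local map into the central map, newest entry per key wins."""
    for key, entry in local.items():
        if key not in central or entry["timestamp"] > central[key]["timestamp"]:
            central[key] = entry
    return central


if __name__ == "__main__":
    central = {"doorway_2": {"label": "tight", "timestamp": 100}}
    robot_a = {"doorway_2": {"label": "tight, often propped open", "timestamp": 180},
               "hallway_5": {"label": "busy at lunchtime", "timestamp": 150}}
    print(merge_maps(central, robot_a))
```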

In some embodiments, the map 2120 can be a mixed-initiative map where information (e.g., environmental constraint(s), rule(s), skill(s), etc.) can be learned (e.g., during learning 2102 and/or execution 2112) and/or provided by a user (e.g., via user input 2130). For example, the robotic device can build on the map 2120 by incorporating information directly given by a user (e.g., a robot supervisor or local user), learned through demonstration and/or execution, or interactively provided through a combination of demonstration and/or execution and user input (e.g., requested by a robotic device about an interaction in real time or retroactively). For example, a user can indicate to a robotic device that it should move slower through a region on the map 2120 (e.g., a narrow or busy hallway). In some embodiments, the robotic device can present a map to the user, and the user can draw the region on the map and indicate that the robotic device should move slower through that region. The robotic device may attempt moving through the region on the map 2120 a number of times, and learn that the region should be avoided at certain times of day (e.g., due to overcrowding). Alternatively or additionally, the robotic device can encounter a new situation that the robotic device is not familiar with handling autonomously (e.g., the hallway being sectioned off). In such instances, the robotic device can communicate with a user (e.g., a robot supervisor and/or local user) to gain more information about the region and/or determine how to navigate around the region. When requesting the user's input, the robotic device can ask broadly regarding the situation and/or provide a list of examples for a user to label (e.g., to enable the robotic device to autonomously derive an appropriate behavior). In some embodiments, the robotic device can detect and/or derive information regarding a region without user input, and propose one or more rules associated with that region to a user for confirmation. In some embodiments, a user, upon viewing a proposed rule and/or other information from a robotic device, can modify the rule and/or other information before accepting it. FIG. 24, described below, provides a more detailed description of such a process.

FIGS. 22-25 are flow diagrams that depict different avenues through which a robotic device (e.g., any of the robotic devices described herein including, for example, robotic device 102, 110, 200, 400, etc.) can learn skill(s), environmental constraint(s), etc. In some embodiments, a robotic device operating according to the methods described herein can be constantly learning (e.g., operating in a learning mode). For example, the robotic device can be collecting, storing, analyzing, and/or updating information on a continuous basis, and be adapting and/or adding to its library of learned information (e.g., skills, behaviors, environmental constraints, map, etc.), as it navigates through an environment and/or engages in certain behavior or actions. In other embodiments, a robotic device can switch between operating in a learning mode or in an execution mode. While operating in a learning mode 2202, a robotic device such as those described herein can learn via demonstration 2204, via execution 2206, via exploration 2208, via derivation/user input 2210, and/or any combination thereof.

In some embodiments, the robotic device can learn through demonstration 2204, e.g., as previously shown and described with reference to FIGS. 8-10. For example, the robotic device can learn a new skill or environmental constraint through a demonstration of a skill using the LfD teaching process. The robotic device can analyze and extract information from past demonstrations of a skill in a human environment. As an example, a user can demonstrate to the robotic device how to move its manipulation element (e.g., an arm) in a supply room with a human and the robotic device. From the demonstration(s), the robotic device can define one or more environmental constraints that limit its motion, when planning and/or executing the skill in the future, to the motion of the existing demonstration(s). The robotic device can generate motions and construct a graph, where the environmental constraints are encoded, using existing demonstrations (e.g., sequences of keyframe(s)), each offering a node and path through an n-dimensional space of motion. When the robotic device is then presented with a new environment and needs to adapt the skill to that new environment, the robotic device can efficiently sample from its existing set of demonstrated motions using the constructed graph to plan a motion for the new environment. By building the learning of such environmental constraints into the initial demonstrations of a skill, the robotic device can quickly adapt to new environments without requiring new environmental constraints to be defined. In the new environment, the robotic device can scan the environment to obtain additional information regarding that environment and/or object(s) (e.g., humans, supplies, doors, etc.) within the environment, and then use its existing skill model to plan a motion trajectory through the new environment without deviating significantly from its existing set of demonstrated motions.
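The sketch below illustrates one possible form of such a demonstration graph, assuming the networkx library is available: each demonstration contributes its keyframes as nodes and its sequence as edges, keyframes from different demonstrations that lie close together are also connected, and a plan for a new environment is found by searching the graph so that motion stays near previously demonstrated keyframes. The linking radius and data layout are assumptions, not the disclosed encoding.

```python
# Sketch: build a graph from demonstrated keyframe sequences and search it to
# obtain a motion that stays close to previously demonstrated motions.
import itertools

import networkx as nx
import numpy as np


def build_demonstration_graph(demonstrations, link_radius=0.05):
    """demonstrations: list of (N_i, D) keyframe arrays. Returns a weighted graph."""
    graph = nx.Graph()
    nodes = []
    for d, demo in enumerate(demonstrations):
        for k, keyframe in enumerate(demo):
            node = (d, k)
            graph.add_node(node, config=np.asarray(keyframe))
            nodes.append(node)
            if k > 0:   # consecutive keyframes within one demonstration
                graph.add_edge((d, k - 1), node,
                               weight=float(np.linalg.norm(demo[k] - demo[k - 1])))
    for a, b in itertools.combinations(nodes, 2):   # cross-demonstration links
        dist = float(np.linalg.norm(graph.nodes[a]["config"] - graph.nodes[b]["config"]))
        if dist <= link_radius:
            graph.add_edge(a, b, weight=dist)
    return graph


if __name__ == "__main__":
    demo_1 = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.2]])
    demo_2 = np.array([[0.2, 0.21], [0.3, 0.3]])
    graph = build_demonstration_graph([demo_1, demo_2])
    path = nx.shortest_path(graph, (0, 0), (1, 1), weight="weight")
    print(path)   # a motion assembled from the demonstrated keyframes
```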

In some embodiments, the robotic device can learn through demonstration 2204 and user input 2210. For example, the robotic device can engage in interactive learning with a user, e.g., with an interactive learning template. As described above, with interactive learning, the robotic device can be taught and/or provided an initial set of skill models. The initial set of models can be developed in a factory setting, by other robotic devices, and/or by the robotic device itself in one or more settings. While operating in a specific environment, the robotic device can adapt or specialize this initial set of skills via interactive learning sessions, where the robotic device can autonomously perform certain portions of a skill while leaving other portions to be demonstrated by a user. Further details of such an interactive learning process are described with reference to FIG. 16.

In some embodiments, the robotic device can learn while executing skills 2206. For example, the robotic device can collect, store, analyze, and/or update information (e.g., at 774 in FIG. 11; at 1304, 1305, and 1307 in FIG. 13), as the robotic device executes a behavior, an action, etc. In an embodiment, a robotic device can execute a skill that requires the robotic device to move through a hallway several times a day. The robotic device can learn that, at certain times of day, the hallway has greater traffic. The robotic device can then adapt its behavior, for example, to avoid going through the hallway during those times of high traffic (e.g., by going an alternative route and/or waiting to go through the hallway at times of low traffic).

In some embodiments, the robotic device can learn via exploration 2208. FIG. 23 is a flow diagram depicting an example method of learning through exploration. The robotic device can scan an environment and collect information on the environment and/or objects within the environment, at 2302. Optionally, based on the information the robotic device has collected, the robotic device can determine to explore the environment by executing a skill. To execute the skill, the robotic device can select an existing skill model, at 2304, plan the execution of the skill using the skill model, at 2306, and execute the skill, at 2308. For example, the robotic device can scan an environment and identify a doorway. The robotic device can determine to execute a skill including steps of opening a door and moving through the doorway. At 2304, the robotic device can select an existing skill model that it has been provided and/or learned to navigate through doorways. At 2306, the robotic device can generate a plan for opening the door and moving through the doorway using the existing skill, e.g., according to a process similar to the process discussed with reference to FIG. 11. And at 2308, the robotic device can execute the skill according to the generated plan. While executing the skill, the robotic device can collect information on the environment and/or its execution of the skill, and compare such information to information collected during previous interactions with doorways, at 2310. The robotic device can evaluate, based on this comparison, whether its interaction with the doorway was a success or failure (e.g., based on success criteria, as described above). The robotic device can optionally generate a skill model that is specific to the doorway based on its execution of the skill, at 2312.
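A condensed, hypothetical sketch of this exploration loop (FIG. 23) is shown below: execute a skill with an existing model, compare the outcome against prior interactions, and re-plan on failure until an objective is met. Every function hook is a placeholder assumption rather than the disclosed implementation.

```python
# Sketch: explore by repeatedly planning, executing, and evaluating a skill,
# collecting information from each attempt.
def explore_skill(plan_fn, execute_fn, evaluate_fn, max_attempts=3):
    """Return (succeeded, collected) after at most max_attempts executions."""
    collected = []
    for attempt in range(max_attempts):
        plan = plan_fn(attempt)                     # 2306: plan with the selected model
        outcome = execute_fn(plan)                  # 2308: execute and collect information
        collected.append(outcome)
        if evaluate_fn(outcome):                    # 2310: compare against prior interactions
            return True, collected
    return False, collected                         # 2316: store whatever was learned


if __name__ == "__main__":
    attempts = iter([False, True])                  # fail once, then succeed
    ok, data = explore_skill(
        plan_fn=lambda i: f"doorway plan v{i}",
        execute_fn=lambda plan: {"plan": plan, "through_doorway": next(attempts)},
        evaluate_fn=lambda outcome: outcome["through_doorway"],
    )
    print(ok, data)
```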

In some embodiments, the robotic device may initiate execution of a skill and determine that user input is required to further complete the skill. For example, as described above with reference to FIG. 16, the robotic device can initiate the execution of a skill and, upon reaching a part of the skill that the robotic device cannot determine how to execute, solicit user input to build on its execution of the skill.

In some embodiments, the robotic device may execute a skill and determine that its execution has failed. For example, the robotic device can plan and/or execute a movement through a doorway but detect that it was not able to move through the doorway (e.g., due to a failed attempt to open the door or navigate across a tight doorway). The robotic device can determine to scan the environment again (2314: YES), and then re-plan and re-execute the skill, at 2306-2308. In some embodiments, the robotic device may select a different skill (e.g., a skill more specific to a tight doorway), at 2304, and then re-plan and re-execute based on the different skill. The robotic device can continue to learn via exploration by re-scanning, re-planning, and re-executing a skill until the robotic device determines that it has met a certain objective (e.g., successfully executed the skill a predetermined number of times or ways, learned sufficient information regarding the environment, etc.). At 2316, the robotic device can store the information it has collected of the environment and/or objects within the environment and, optionally, any models that it has adapted or generated based on executing skills within the environment.

In some embodiments, the robotic device may engage in learning by exploration 2208 with user input 2210. For example, the robotic device can explore a particular environment and/or execute one or more skills in that environment to learn more information about that environment. While the robotic device is exploring, a user (e.g., a remote supervisor or a local user) can provide inputs to the robotic device based on information that the user perceives and/or has learned about the environment (e.g., directly by being in the environment and/or through the robotic device). The user can then provide one or more inputs into the robotic device that further guide its exploration of the environment. For example, a robotic device that encounters a doorway may plan one or more movements to navigate through the doorway. The robotic device, upon executing the movements, may fail to navigate through the doorway. A user can view the failed attempt by the robotic device and provide input(s) into the robotic device to guide the robotic device in a second attempt to navigate through the doorway. For example, the user can determine that the doorway is a tight doorway and indicate to the robotic device that it should seek a nearby user or robotic device to first open the door before navigating through the doorway. The robotic device, upon receiving the user input, can then seek a user to open the door before navigating through the doorway.

In some embodiments, the robotic device can learn via user input 2210 by receiving information regarding an environment and/or objects within the environment from a user, or receiving initial sets of models, rules, etc. from a user. In some embodiments, a robotic device may initially learn a skill through a demonstration (e.g., as described above with reference to FIG. 10) and learn to adapt that skill based on a user input. For example, a robotic device can learn a skill for moving a supply from a first location to a second location and then dropping off the supply at the second location. During a user demonstration of the skill, the user can indicate to the robotic device that dropping off the supply at the second location requires specific information unique to an environment or situation. The user can later provide inputs to the robotic device specifying, for different supplies and/or different locations, the unique information that the robotic device should observe while dropping off the supply. For example, the user can provide information regarding a tag or a marker that identifies where the robotic device should drop off the supply.

In some embodiments, the robotic device can learn behavior or rules (e.g., arbitration algorithm(s) 1858, as described above) based on stored information and/or user inputs 2210. FIG. 24 is a flow chart of an example method of robotic learning of behavior and/or rules. The robotic device can analyze stored information (e.g., map(s), model(s), user input(s)), at 2402. The robotic device can derive behavior and/or rules based on its analysis of stored information, at 2404. For example, a user (e.g., a robot supervisor or local user) can define a rule for a robotic device to be more conversational when more humans are near the robotic device. The robotic device may determine, based on its analysis of stored information, that more humans are near it during a certain time of day (e.g., around lunchtime). Therefore, the robotic device may further derive that it should be more conversational during that time of day. In some embodiments, the robotic device can automatically implement this behavior as a new rule. Alternatively, the robotic device can propose being more conversational at that time of day as a new rule to the user, at 2406. The robotic device can receive a user input in response to proposing the rule, at 2408. For example, the user may view the rule on a local and/or remote display of the robotic device and select to accept or reject it. Based on the user input, the robotic device can modify its behavior and/or rules, at 2410. For example, if the user indicates that the rule is acceptable, the robotic device can store the new rule and implement it in the future. Alternatively, if the user indicates that the rule is acceptable but with a further tweak or change (e.g., being more conversational around lunchtime but only in a cafeteria), then the robotic device can adapt the rule and store the adapted rule for future implementation.
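A toy sketch of this rule-derivation flow (FIG. 24) follows: stored observations (here, counts of nearby humans per hour) are analyzed, a candidate rule is derived, and the rule is held as "proposed" until the user accepts, modifies, or rejects it. The data layout and threshold are illustrative assumptions.

```python
# Sketch: derive a "be more conversational at busy hours" rule from stored
# observations and mark it as proposed, pending user confirmation (2406-2410).
from collections import defaultdict


def derive_conversational_rule(observations, crowd_threshold=5):
    """observations: list of (hour, humans_nearby). Returns a proposed rule or None."""
    totals, counts = defaultdict(int), defaultdict(int)
    for hour, humans in observations:
        totals[hour] += humans
        counts[hour] += 1
    busy_hours = sorted(h for h in totals if totals[h] / counts[h] >= crowd_threshold)
    if not busy_hours:
        return None
    return {"behavior": "be_more_conversational", "hours": busy_hours,
            "status": "proposed"}                   # await user accept/modify/reject


if __name__ == "__main__":
    logged = [(11, 2), (12, 7), (12, 9), (13, 6), (15, 1)]
    print(derive_conversational_rule(logged))       # proposes the lunchtime hours 12-13
```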

In some embodiments, the robotic device can learn a skill and/or environmental constraint by adapting an existing skill and/or environmental constraint based on a user input. FIG. 25 is a flow diagram of an example method of adapting a skill and/or environmental constraint based on a user input. The robotic device can receive a user input, at 2502. The robotic device can associate the user input with one or more skill(s) and/or environmental constraints, at 2504. Such association can be based on, for example, a further input by the user (e.g., the user specifying the skill and/or environmental constraint that the input relates to) and/or a derivation by the robotic device (e.g., the robotic device determining that such an input is associated with a particular skill or environmental constraint based on one or more rules, attributes, etc. associated with the input and/or the particular skill or environmental constraint). For example, a user can provide a robotic device with new information for identifying an object, such as a QR code, a barcode, a tag, etc. The user can specify that this new information is for a particular object, and the robotic device can associate the new information with that object. The robotic device can further associate the new information for that object with one or more skills that involve interactions with that object, e.g., a grabbing or moving skill. The robotic device can optionally modify and/or generate a skill or environmental constraint based on the input, at 2506-2508.

In some embodiments, the robotic device can enable a user to retroactively label information associated with a skill and/or environmental constraint. For example, the robotic device can present a skill or environmental constraint to a user, e.g., via a user interface (e.g., user interface 240) and/or a remote interface. The user can select to add, modify, and/or remove a label associated with the skill or environmental constraint, and such input can be received at the robotic device, at 2502. The robotic device can then associate the input with the skill or environmental constraint, at 2504, and modify and/or generate a new skill or environmental constraint based on the input, at 2506-2508.

EXAMPLES

It will be appreciated that the present disclosure may include any one and up to all of the following examples.

Example 1: An apparatus, comprising: a memory; a processor; a manipulating element; and a set of sensors, the processor operatively coupled to the memory, the manipulating element, and the set of sensors, and configured to: obtain, via a subset of sensors from the set of sensors, a representation of an environment; identify a plurality of markers in the representation of the environment, each marker from the plurality of markers associated with a physical object from a plurality of physical objects located in the environment; present information indicating a position of each marker from the plurality of markers in the representation of the environment; receive a selection of a set of markers from the plurality of markers associated with a set of physical objects from the plurality of physical objects; obtain, for each position from a plurality of positions associated with a motion of the manipulating element in the environment, sensory information associated with the manipulating element, the motion of the manipulating element associated with a physical interaction between the manipulating element and the set of physical objects; and generate, based on the sensory information, a model configured to define movements of the manipulating element to execute the physical interaction between the manipulating element and the set of physical objects.
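
By way of illustration of Example 1, the ordering of operations could be sketched as follows. The helper functions passed in (scan_environment, detect_markers, ask_user_to_select, record_demonstration, fit_model) are hypothetical stand-ins for perception, user-interface, and learning modules that Example 1 does not specify; only the sequence of steps follows the example.

    # Compressed, hypothetical sketch of the Example 1 pipeline.
    import numpy as np

    def learn_skill(scan_environment, detect_markers, ask_user_to_select,
                    record_demonstration, fit_model):
        representation = scan_environment()                   # obtain representation
        markers = detect_markers(representation)              # identify markers
        selected = ask_user_to_select(markers)                # present + receive selection
        # Sensory information at each position of the demonstrated motion,
        # e.g., end-effector pose, joint states, forces.
        samples = record_demonstration(selected)
        return fit_model(samples)                             # generate the model

    # Toy stand-ins so the sketch runs end to end.
    model = learn_skill(
        scan_environment=lambda: "point_cloud",
        detect_markers=lambda rep: ["marker_a", "marker_b", "marker_c"],
        ask_user_to_select=lambda markers: markers[:2],
        record_demonstration=lambda sel: np.random.rand(50, 7),   # 50 poses, 7 values each
        fit_model=lambda samples: {"mean_pose": samples.mean(axis=0)},
    )
    print(model["mean_pose"].shape)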

Example 2: The apparatus of Example 1, wherein the set of physical objects includes a human.

Example 3: The apparatus of any one of Examples 1-2, wherein the manipulating element includes an end effector configured to engage with a subset of physical objects from the set of physical objects.

Example 4: The apparatus of any one of Examples 1-3, wherein the subset of sensors is a first subset of sensors, the processor configured to obtain the sensory information via a second subset of sensors from the set of sensors, the second subset of sensors different from the first subset of sensors.

Example 5: The apparatus of any one of Examples 1-4, wherein the manipulating element includes a plurality of movable components joined via a plurality of joints, the set of sensors including at least one of: a sensor configured to measure a force acting on a joint from the plurality of joints; or a sensor configured to detect an engagement between a movable component from the plurality of movable components and a physical object from the set of physical objects.

Example 6: The apparatus of Example 5, wherein the set of sensors further includes a sensor configured to measure a position of the joint or the movable component, relative to a portion of the apparatus.

Example 7: The apparatus of Example 5, wherein the set of sensors further includes at least one of: a light sensor, a temperature sensor, an audio capture device, and a camera.

Example 8: The apparatus of Example 1, wherein the manipulating element includes (i) a plurality of joints, and (ii) an end effector configured to move a physical object from the set of physical objects, the set of sensors including a sensor configured to measure a force placed on at least one of the end effector or a joint from the plurality of joints coupled to the end effector when the end effector is moving the physical object.

Example 9: The apparatus of any one of Examples 1-8, wherein the sensory information includes sensor data associated with a set of features, the processor further configured to: receive a selection of a first subset of features from the set of features, the processor configured to generate the model based on sensor data associated with the first subset of features and not based on sensor data associated with a second subset of features from the set of features not included in the first subset of features.
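
By way of illustration of Examples 9 and 10, restricting model generation to a user-selected subset of features could be sketched as follows. The feature names and the use of a column-indexed array are assumptions made for the sketch only.

    # Hypothetical illustration: the model is generated only from sensor data
    # for the features the user selects; other features are ignored.
    import numpy as np

    all_features = ["x", "y", "z", "gripper_force", "ambient_light"]
    sensor_data = np.random.rand(100, len(all_features))      # 100 samples

    user_selected = ["x", "y", "z", "gripper_force"]           # first subset of features
    columns = [all_features.index(f) for f in user_selected]
    training_data = sensor_data[:, columns]                    # "ambient_light" is excluded

    model = {"feature_means": training_data.mean(axis=0), "features": user_selected}
    print(model)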

Example 10: The apparatus of Example 9, wherein the processor is further configured to prompt a user to select at least one feature from the set of features such that the processor receives the selection of the first subset of features in response to a selection made by the user.

Example 11: The apparatus of any one of Examples 1-10, wherein the plurality of markers are fiducial markers, and the representation of the environment is a visual representation of the environment.

Example 12: The apparatus of any one of Examples 1-11, wherein the processor is further configured to save the model and information associated with the model in the memory, the information associated with the model including: (i) the set of markers, and (ii) the sensory information.

Example 13: The apparatus of Example 1, wherein the manipulating element includes a plurality of joints, the sensory information including, for each position from the plurality of positions associated with the motion of the manipulating element, information indicating a current state of each joint from the plurality of joints.

Example 14: The apparatus of any one of Examples 1-13, wherein the processor is further configured to prompt, after the presenting, a user to select at least one marker from the plurality of markers such that the processor receives the selection of the set of markers in response to a selection made by the user.

Example 15: The apparatus of any one of Examples 1-14, wherein the processor is configured to obtain the representation of the environment by scanning a region of interest in the environment using the set of sensors.

Example 16: A method, comprising: obtaining, via a set of sensors, a representation of an environment; identifying a plurality of markers in the representation of the environment, each marker from the plurality of markers associated with a physical object from a plurality of physical objects located in the environment; presenting information indicating a position of each marker from the plurality of markers in the representation of the environment; receiving, after the presenting, a selection of a set of markers from the plurality of markers associated with a set of physical objects from the plurality of physical objects; obtaining, for each position from a plurality of positions associated with a motion of a manipulating element in the environment, sensory information associated with the manipulating element, the motion of the manipulating element associated with a physical interaction between the manipulating element and the set of physical objects; and generating, based on the sensory information, a model configured to define movements of the manipulating element to execute the physical interaction between the manipulating element and the set of physical objects.

Example 17: The method of Example 16, wherein the sensory information includes sensor data associated with a set of features, the method further comprising: receiving a selection of a first subset of features from the set of features, the generating including generating the model based on sensor data associated with the first subset of features and not based on sensor data associated with a second subset of features from the set of features not included in the first subset of features.

Example 18: The method of Example 17, further comprising prompting a user to select at least one feature from the set of features such that the selection of the first subset of features is received in response to a selection made by the user.

Example 19: The method of any one of Examples 16-18, wherein the plurality of markers are fiducial markers, and the representation of the environment is a visual representation of the environment.

Example 20: The method of any one of Examples 16-19, wherein the manipulating element includes a plurality of joints, the sensory information including, for each position from the plurality of positions associated with the motion of the manipulating element, information indicating a current state of each joint from the plurality of joints.

Example 21: The method of any one of Examples 16-20, further comprising prompting, after the presenting, a user to select at least one marker from the plurality of markers such that the selection of the set of markers is received in response to a selection made by the user.

Example 22: The method of any one of Examples 16-21, wherein the obtaining the representation of the environment includes scanning a region of interest in the environment using the set of sensors.

Example 23: A non-transitory processor-readable medium storing code representing instructions to be executed by a processor, the code comprising code to cause the processor to: obtain, via a set of sensors, a representation of an environment; identify a plurality of markers in the representation of the environment, each marker from the plurality of markers associated with a physical object from a plurality of physical objects located in the environment; present information indicating a position of each marker from the plurality of markers in the representation of the environment; in response to receiving a selection of a set of markers from the plurality of markers associated with a set of physical objects from the plurality of physical objects, identify a model associated with executing a physical interaction between a manipulating element and the set of physical objects, the manipulating element including a plurality of joints and an end effector; and generate, using the model, a trajectory for the manipulating element that defines movements of the plurality of joints and the end effector associated with executing the physical interaction.
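
By way of illustration of Example 23, identifying a stored model from a selected set of markers and generating a trajectory from it could be sketched as follows. The library keyed by marker sets and the expansion of waypoints into joint configurations are assumptions made for the sketch; they are not the disclosed representation.

    # Hypothetical sketch: look up a stored model from the selected markers and
    # use it to generate a joint-space trajectory.
    import numpy as np

    model_library = {
        frozenset({"marker_a", "marker_b"}): {"waypoints": np.linspace(0.0, 1.0, 5)},
    }

    def generate_trajectory(selected_markers, library):
        model = library.get(frozenset(selected_markers))       # identify the model
        if model is None:
            raise KeyError("no stored model for this marker set; prompt the user")
        # Expand each scalar waypoint into a 7-element joint configuration
        # (6 joints + end-effector opening) purely for illustration.
        return [np.full(7, w) for w in model["waypoints"]]

    trajectory = generate_trajectory({"marker_a", "marker_b"}, model_library)
    print(len(trajectory), trajectory[0])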

Example 24: The non-transitory processor-readable medium of Example 23, wherein the code to cause the processor to identify the model associated with executing the physical interaction includes code to cause the processor to prompt a user to identify the model.

Example 25: The non-transitory processor-readable medium of any one of Examples 23-24, wherein the code to cause the processor to identify the model associated with executing the physical interaction includes code to cause the processor to identify the model based on the selection of the set of markers.

Example 26: The non-transitory processor-readable medium of any one of Examples 23-25, further comprising code to cause the processor to: display to a user the trajectory for the manipulating element in the representation of the environment; receive, after the displaying, an input from the user; and in response to the input indicating an acceptance of the trajectory for the manipulating element, implement the movements of the plurality of joints and the end effector to execute the physical interaction.

Example 27: The non-transitory processor-readable medium of any one of Examples 23-26, wherein the trajectory is a first trajectory, the non-transitory processor-readable medium further comprising code to cause the processor to, in response to the input not indicating an acceptance of the trajectory for the manipulating element: modify a set of parameters associated with the model to produce a modified model; and generate, using the modified model, a second trajectory for the manipulating element.

Example 28: The non-transitory processor-readable medium of any one of Examples 23-26, wherein the trajectory is a first trajectory and the model is a first model, the non-transitory processor-readable medium further comprising code to cause the processor to, in response to the input not indicating an acceptance of the trajectory for the manipulating element: generate a second model based on sensor data associated with a set of features different from a set of features used to generate the first model; and generate, using the second model, a second trajectory for the manipulating element.

Example 29: The non-transitory processor-readable medium of any one of Examples 23-28, further comprising code to cause the processor to: implement the movements of the plurality of joints and the end effector to execute the physical interaction; obtain sensory information associated with the execution of the physical interaction; and determine whether the execution of the physical interaction meets a predefined and/or learned success criterion based on the sensory information.

Example 30: The non-transitory processor-readable medium of Example 29, further comprising code to cause the processor to: in response to determining that the execution of the physical interaction meets the predefined and/or learned success criterion, generate a signal indicating that the physical interaction was successful; and in response to determining that the execution of the physical interaction does not meet the predefined and/or learned success criterion: modify the model based on the sensory information to produce a modified model; and generate, using the modified model, a second trajectory for the manipulating element.
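
By way of illustration of Examples 29 and 30, the execute-verify-retry loop could be sketched as follows. The numeric success threshold, the retry limit, and the way the model is modified between attempts are assumptions made for the sketch only.

    # Hypothetical sketch: execute, evaluate a success criterion from sensory
    # feedback, and either report success or refine the model and retry.
    def execute_and_verify(model, execute, sense, success_threshold=0.9, max_attempts=3):
        for attempt in range(max_attempts):
            execute(model)                         # implement the planned movements
            score = sense()                        # sensory information after execution
            if score >= success_threshold:         # predefined/learned success criterion
                return {"success": True, "attempts": attempt + 1}
            model = {**model, "gain": model.get("gain", 1.0) * 0.9}   # modify the model
        return {"success": False, "attempts": max_attempts}

    # Toy stand-ins: the "robot" succeeds on the second attempt.
    scores = iter([0.4, 0.95, 0.99])
    result = execute_and_verify({"gain": 1.0}, execute=lambda m: None, sense=lambda: next(scores))
    print(result)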

Example 31: The non-transitory processor-readable medium of any one of Examples 23-30, wherein the model is associated with (i) a stored set of markers, (ii) sensory information indicating at least one of a position or an orientation of the manipulating element at points along a stored trajectory of the manipulating element associated with the stored set of markers, and (iii) sensory information indicating a configuration of the plurality of joints at the points along the stored trajectory; the code to cause the processor to generate the trajectory for the manipulating element includes code to cause the processor to: compute a transformation function between the set of markers and the stored set of markers; transform, for each point, the at least one of the position or the orientation of the manipulating element using the transformation function; determine, for each point, a planned configuration of the plurality of joints based on the configuration of the plurality of joints at the points along the stored trajectory; and determine, for each point, a portion of the trajectory between that point and a consecutive point based on the planned configuration of the plurality of joints for that point.
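
By way of illustration of Example 31, one possible choice of transformation function between the observed set of markers and the stored set of markers is a rigid (rotation plus translation) transformation estimated by a least-squares (Kabsch) fit, which can then be applied to the stored end-effector positions along the demonstrated trajectory. The sketch below assumes three-dimensional marker positions and omits the joint-configuration planning steps of Example 31.

    # Hypothetical numerical sketch; the Kabsch fit is one possible choice of
    # transformation function, not the prescribed one.
    import numpy as np

    def rigid_transform(stored, observed):
        """Return rotation R and translation t mapping stored points to observed points."""
        cs, co = stored.mean(axis=0), observed.mean(axis=0)
        H = (stored - cs).T @ (observed - co)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                  # avoid reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        return R, co - R @ cs

    stored_markers = np.array([[0.0, 0, 0], [1.0, 0, 0], [0.0, 1, 0]])
    observed_markers = stored_markers + np.array([0.5, 0.2, 0.0])     # scene shifted

    R, t = rigid_transform(stored_markers, observed_markers)
    stored_trajectory = np.array([[0.1, 0.1, 0.3], [0.4, 0.1, 0.3]])  # stored EE positions
    new_trajectory = stored_trajectory @ R.T + t                      # transformed positions
    print(np.round(new_trajectory, 3))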

Example 32: The non-transitory processor-readable medium of any one of Examples 23-30, wherein the model is associated with (i) a stored set of markers, and (ii) sensory information indicating at least one of a position or an orientation of the manipulating element at points along a stored trajectory of the manipulating element associated with the stored set of markers; the code to cause the processor to generate the trajectory for the manipulating element includes code to cause the processor to: compute a transformation function between the set of markers and the stored set of markers; transform, for each point, the at least one of the position or the orientation of the manipulating element using the transformation function; determine, for each point, a planned configuration of the plurality of joints; and determine, for each point, a portion of the trajectory between that point and a consecutive point based on the planned configuration of the plurality of joints for that point.

Example 33: The non-transitory processor-readable medium of any one of Examples 23-32, further comprising code to cause the processor to: determine whether to change a location of the manipulating element based on a distance between a first location of the manipulating element and a location of a physical object from the set of physical objects; and move, in response to determining to change a location of the manipulating element, the manipulating element from the first location to a second location more proximal to the location of the physical object, wherein the code to cause the processor to generate the trajectory for the manipulating element includes code to cause the processor to generate, after the moving, the trajectory based on the location of the physical object relative to the second location of the manipulating element.
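
By way of illustration of Example 33, the decision to reposition the manipulating element before generating the trajectory could be sketched as follows. The reach radius and the stopping fraction are assumptions made for the sketch only.

    # Hypothetical sketch: move closer to the object before planning if it is
    # outside an assumed reach radius.
    import math

    def maybe_reposition(manipulator_xy, object_xy, reach=0.8):
        distance = math.dist(manipulator_xy, object_xy)
        if distance <= reach:
            return manipulator_xy                    # already within reach; plan from here
        # Move along the line toward the object, stopping just inside the reach radius.
        fraction = (distance - 0.5 * reach) / distance
        return tuple(m + fraction * (o - m) for m, o in zip(manipulator_xy, object_xy))

    print(maybe_reposition((0.0, 0.0), (2.0, 0.0)))   # -> a point ~1.6 m toward the object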

While various inventive embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the inventive embodiments described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the inventive teachings is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific inventive embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described and claimed. Inventive embodiments of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the inventive scope of the present disclosure.

Also, various inventive concepts may be embodied as one or more methods, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.

Some embodiments and/or methods described herein can be performed by software (executed on hardware), hardware, or a combination thereof. Hardware modules may include, for example, a general-purpose processor, a field programmable gate array (FPGA), and/or an application-specific integrated circuit (ASIC). Software modules (executed on hardware) can be expressed in a variety of software languages (e.g., computer code), including C, C++, Java™, Ruby, Visual Basic™, and/or other object-oriented, procedural, or other programming language and development tools. Examples of computer code include, but are not limited to, micro-code or micro-instructions, machine instructions, such as produced by a compiler, code used to produce a web service, and files containing higher-level instructions that are executed by a computer using an interpreter. For example, embodiments may be implemented using imperative programming languages (e.g., C, Fortran, etc.), functional programming languages (e.g., Haskell, Erlang, etc.), logical programming languages (e.g., Prolog), object-oriented programming languages (e.g., Java, C++, etc.), or other suitable programming languages and/or development tools. Additional examples of computer code include, but are not limited to, control signals, encrypted code, and compressed code.

It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. Thus, a first element discussed herein could be termed a second element without departing from the teachings of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising” or “includes” and/or “including” when used in this specification, specify the presence of stated features, regions, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, regions, integers, steps, operations, elements, components, and/or groups thereof.

1. A robotic device, comprising: a memory; a processor; a manipulating element; and a set of sensors, the processor operatively coupled to the memory, the manipulating element, and the set of sensors, the processor configured to: obtain, via a first subset of sensors from the set of sensors, a representation of an environment; identify a set of markers in the representation of the environment, each marker from the set of markers associated with a physical object from a set of physical objects located in the environment; obtain, via a second subset of sensors from the set of sensors, sensory information at each keyframe from a plurality of keyframes during a demonstration of a skill by a user in which the user demonstrates a physical interaction between the manipulating element and the set of physical objects, the sensory information including information of a set of features associated with the manipulating element and the environment, each keyframe of the plurality of keyframes representing a discrete point during the demonstration of the skill with at least two adjacent keyframes being separated from one another by a time interval; present, via a user interface and to the user, the set of features such that the user can select, from the set of features, a subset of the set of features that is relevant to learning the skill; and generate a model for executing the skill using a subset of the sensory information that is associated with the subset of features selected by the user.
 2. A robotic device, comprising: a memory; a processor; a manipulating element; a transport element configured to move along a surface; and a set of sensors, the processor operatively coupled to the memory, the manipulating element, the transport element, and the set of sensors, the processor configured to: obtain, via the set of sensors, information of an environment; select, based on an arbitration algorithm or an input from a user, a skill to execute from a plurality of skills, the execution of the skill including a physical interaction between the manipulating element and a physical object in the environment; generate, using a model for executing the skill, a plan to execute the skill based on the information of the environment; move the manipulating element and the transport element based on the plan to execute at least a first portion of the skill while obtaining additional information of the environment; in response to detecting a predefined change in the environment based on the additional information obtained of the environment, interrupt the execution of the skill; prompt a user to provide feedback based on the predefined change in the environment; and generate a plan to execute a second portion of the skill based on the feedback.
 3. A robotic device, comprising: a memory; a processor; a manipulating element; a transport element configured to move along a surface; and a set of sensors, the processor operatively coupled to the memory, the manipulating element, the transport element, and the set of sensors, the processor configured to: obtain, via the set of sensors, information of a first portion of an environment; select, based on an arbitration algorithm or an input from a user, a skill to execute from a plurality of skills, the execution of the skill including a physical interaction between the manipulating element and a physical object in the environment; generate, using a model for executing at least a first portion of the skill, a plan to execute the first portion of the skill based on the information of the first portion of the environment; move the manipulating element and the transport element based on the plan to execute the first portion of the skill; obtain, via the set of sensors, information of a second portion of the environment that is different from the first portion of the environment; in response to determining that user input is required to execute a second portion of the skill in the second portion of the environment, prompt the user to provide at least one input associated with executing the second portion of the skill; and generate a plan to execute the second portion of the skill based on the user input.
 4. An apparatus, comprising: a manipulating element; a transport element configured to move along a surface; a set of sensors; a memory configured to store a map of an environment, the map including a static layer identifying locations of a first plurality of constraints in the environment and a dynamic layer identifying movements or behaviors of a second plurality of constraints in the environment, each constraint from the first and second plurality of constraints being associated with a physical object in the environment; and a processor operatively coupled to the memory, the manipulating element, the transport element, and the set of sensors, the processor configured to: obtain, via the set of sensors, information of a portion of the environment; select a skill to execute from a plurality of skills; generate, using a model for executing the skill, a plan to execute the skill based on the information of the portion of the environment; move the manipulating element and the transport element based on the plan to execute the skill; obtain, after executing at least a portion of the skill, information of a constraint in the environment via the set of sensors; determine a type of the constraint based on an input from a user or information of a set of physical objects stored in the memory; and store the information of the constraint and the type of constraint by updating the static and dynamic layers of the map.
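
By way of illustration of the map recited in this claim, the static and dynamic layers could be represented as in the following sketch. The constraint names, fields, and the string-based type determination are assumptions made for the sketch; the claim does not prescribe a particular data structure.

    # Hypothetical sketch: a static layer for fixed constraints and a dynamic
    # layer for moving or behavioral constraints, each updated as new
    # constraints are observed and typed.
    static_layer = {"doorway_3": {"location": (12.0, 4.5)}}             # fixed constraints
    dynamic_layer = {"cart_7": {"behavior": "moves along corridor B"}}  # moving constraints

    def store_constraint(name, info, constraint_type):
        """Update the appropriate layer based on the determined constraint type."""
        layer = static_layer if constraint_type == "static" else dynamic_layer
        layer[name] = info

    store_constraint("wet_floor_sign", {"location": (3.0, 9.0)}, "static")
    store_constraint("visitor_group", {"behavior": "gathers near lobby at 9am"}, "dynamic")
    print(sorted(static_layer), sorted(dynamic_layer))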